Posted to dev@kudu.apache.org by Todd Lipcon <to...@cloudera.com> on 2016/10/31 23:41:30 UTC

Internal daemon-to-daemon authorization

Hey folks,

After https://gerrit.cloudera.org/#/c/4765/ finishes code review, the Kudu
daemons will have support for Kerberized RPC. If we start the servers with
--server_require_kerberos, then they'll require that all RPC clients
authenticate themselves via Kerberos.

Of course, authentication isn't that useful on its own. I think the next
step in this direction is to add basic "internal service" authorization:

- the master should only allow tablet server service principals to
heartbeat/register/etc
- the tserver should only allow master service principals to create/delete
tablets, etc. (with the possible extension to also allow 'root' users to do
so)

I think there are a couple of patterns we could follow here to enforce this
authorization:

*Option 1) Try to enforce ACLs on a per-RPC-service level*

This is more or less the approach that Hadoop takes. We can make sure that
different RPC use cases are separated into different RPC "services", and
apply an ACL on each service. For example, we can separate the current
'MasterService' into 'TserverToMasterService' and 'ClientToMasterService',
and then register ACLs on each such service when we start the RPC server.

Pros:
- when adding a new method, it "automatically" gets the right ACL without
having to remember to add special checks
- enforces some nice separation and grouping of RPCs

Cons:
- doesn't handle special cases well, like 'DeleteTablet is usually called
by the master, but we also want to allow admins to call the same method for
use in manual recovery tools'
- would require a bunch of RPC methods to be moved around, and likely break
wire compatibility (I believe the RPC header currently specifies the actual
service name for each method call)
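Concretely, the per-service ACL registration might look something like the sketch below. (The types and names here are hypothetical, not Kudu's real RPC interfaces; the principal-matching convention is purely illustrative.)

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Sketch of per-service ACLs: each service registers an ACL predicate at
// startup, and the dispatcher consults it before invoking any method on
// that service. The principal is the Kerberos-authenticated caller.
using ServiceAcl = std::function<bool(const std::string& principal)>;

std::map<std::string, ServiceAcl>& ServiceAcls() {
  static std::map<std::string, ServiceAcl> acls;
  return acls;
}

void RegisterServiceAcl(const std::string& service, ServiceAcl acl) {
  ServiceAcls()[service] = std::move(acl);
}

// Called by the RPC dispatcher for every inbound call.
bool IsAuthorized(const std::string& service, const std::string& principal) {
  auto it = ServiceAcls().find(service);
  if (it == ServiceAcls().end()) return false;  // unknown service: deny
  return it->second(principal);
}
```

At startup, 'TserverToMasterService' would get an ACL accepting only tserver
service principals, while 'ClientToMasterService' could accept any
authenticated user.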

*Option 2) Entirely ad-hoc authorization checks*

At the top of each RPC method implementation, we can manually check the
authorization. Basically, imagine something like this:

bool CheckTabletServerRequest(RpcContext* rpc) {
  // if 'rpc' is from a valid tablet server, return true
  // otherwise, respond to the RPC with an authz error and return false
}

void MasterServiceImpl::TSHeartbeat(const TSHeartbeatRequestPB* req,
                                    TSHeartbeatResponsePB* resp,
                                    rpc::RpcContext* rpc) {
  if (!CheckTabletServerRequest(rpc)) return;
  // ... normal heartbeat handling follows ...
}


Pros:
- 100% flexible - easily allows for request-specific special cases
- relies on no "magic": it's obvious when you look at the code how the
authz check is applied.

Cons:
- easy to forget to check authorization when adding new RPCs. I recall a
CVE or two in the past where we forgot to add the appropriate check in
various internal Hadoop RPCs.
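For illustration, the ad-hoc check could be fleshed out roughly as below. (This assumes the RpcContext exposes the authenticated Kerberos principal; the real Kudu API differs, and the "kudu/<host>@REALM" matching rule is an invented convention, not Kudu's actual one.)

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for Kudu's rpc::RpcContext, just for this sketch.
struct RpcContext {
  std::string remote_principal;  // e.g. "kudu/host1@EXAMPLE.COM", from Kerberos
  std::string authz_error;       // stand-in for responding with an authz error
  void RespondFailure(const std::string& msg) { authz_error = msg; }
};

bool CheckTabletServerRequest(RpcContext* rpc) {
  // Assume tablet servers authenticate with principals like "kudu/<host>@REALM".
  if (rpc->remote_principal.rfind("kudu/", 0) == 0) {
    return true;
  }
  rpc->RespondFailure("not authorized: caller is not a tablet server");
  return false;
}
```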


*Option 3) Per-method authorization checks, declared in .proto file*

In the .proto file describing a service, we can add a custom option like:

  rpc TSHeartbeat(TSHeartbeatRequestPB) returns (TSHeartbeatResponsePB) {
    option (kudu.authz) = "tablet-servers-only";
  }

When we start the RPC system, we can bind these magic strings to actual
method implementations, and give an error if any RPC method either mentions
an invalid authz policy, or doesn't mention an authz policy at all.

Pros:
- we can require that all methods specify a policy (even if they specify
'none'), making it impossible to forget to authorize
- still retains flexibility: we can always create an exotic policy
implementation for an exotic method as necessary (or specify 'none' and do
the custom check from option 2)
- we can allow the same protobuf option to be specified as a service-wide
'default' for the cases where an entire service has the same authz policy

Cons:
- maybe a little more confusing to developers who have to go looking at the
.proto file to see which authorization policy is being applied to a method?
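The startup-time binding described in option 3 could be sketched roughly as follows. (Names, policies, and the principal-matching rule here are all illustrative; this is not Kudu's actual implementation.)

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Each authz string that may appear in a .proto option maps to a check
// function. Any method naming an unknown policy, or no policy at all,
// fails fast when the server starts rather than at request time.
using AuthzCheck = std::function<bool(const std::string& principal)>;

const std::map<std::string, AuthzCheck>& PolicyRegistry() {
  static const std::map<std::string, AuthzCheck> kPolicies = {
      {"none", [](const std::string&) { return true; }},
      {"tablet-servers-only",
       // Illustrative rule: tservers authenticate as "kudu/<host>@REALM".
       [](const std::string& p) { return p.rfind("kudu/", 0) == 0; }},
  };
  return kPolicies;
}

// Run once per RPC method while wiring up the service at startup.
AuthzCheck BindPolicy(const std::string& method, const std::string& policy) {
  auto it = PolicyRegistry().find(policy);
  if (policy.empty() || it == PolicyRegistry().end()) {
    throw std::runtime_error("RPC method " + method +
                             " has a missing or unknown authz policy: '" +
                             policy + "'");
  }
  return it->second;
}
```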

*Option X? Any other options I missed?*
- I only spent 30m or so thinking about this, so maybe I missed a better
option? Feedback appreciated.


I'm leaning towards option 3 here, but would appreciate feedback before I
go too far down that path.

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Internal daemon-to-daemon authorization

Posted by Dan Burkert <da...@apache.org>.
I like the implementation; I think it walks a good line between being
unobtrusive and being quickly auditable for missing authz on services.

- Dan

On Tue, Nov 1, 2016 at 11:51 AM, Brock Noland <br...@phdata.io> wrote:

> This looks really straightforward IMO. I think as a dev if you are
> modifying the wire protocol, understanding this should be no sweat.
>
> On Mon, Oct 31, 2016 at 9:24 PM, Todd Lipcon <to...@cloudera.com> wrote:
>
> > I whipped up a patch for something somewhat like option 3 here:
> > http://gerrit.cloudera.org:8080/4897
> >
> > Please take a look if you're interested.
> >
> > -Todd

Re: Internal daemon-to-daemon authorization

Posted by Brock Noland <br...@phdata.io>.
This looks really straightforward IMO. I think as a dev if you are
modifying the wire protocol, understanding this should be no sweat.

On Mon, Oct 31, 2016 at 9:24 PM, Todd Lipcon <to...@cloudera.com> wrote:

> I whipped up a patch for something somewhat like option 3 here:
> http://gerrit.cloudera.org:8080/4897
>
> Please take a look if you're interested.
>
> -Todd

Re: Internal daemon-to-daemon authorization

Posted by Todd Lipcon <to...@cloudera.com>.
I whipped up a patch for something somewhat like option 3 here:
http://gerrit.cloudera.org:8080/4897

Please take a look if you're interested.

-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera