Posted to dev@pulsar.apache.org by Matteo Merli <mm...@apache.org> on 2018/10/28 00:48:22 UTC

[DISCUSS] PIP 25: Token based authentication

https://github.com/apache/pulsar/wiki/PIP-25:-Token-based-authentication

------------------------------------------------------------------------

## Motivation

Pulsar has a [pluggable authentication mechanism](
http://pulsar.apache.org/docs/en/security-extending/#authentication)
that currently supports 3 auth providers.

 1. TLS certificates
 2. [Athenz](http://www.athenz.io/)
 3. [Basic access authentication](
https://en.wikipedia.org/wiki/Basic_access_authentication)

Each of them has a few issues, which can be summarized as:

 1. TLS
   * Requires multiple key files on both clients and brokers, which makes
     the credentials difficult to distribute
   * The tools to generate keys (OpenSSL) are hard to use
   * It is hard to automate the creation of certificates
   * Managing the certificate authority (CA) certificate is even harder

 2. Athenz
   * Requires an additional service to run
   * Also targets authorization, which we don't need since we have an
     internal implementation for that

 3. Basic auth
   * Very minimal password-file-based authentication, not really usable
     when there are multiple clients/brokers

To address these issues, this proposal plans to add a new auth provider
that uses [JSON Web Tokens](https://jwt.io/introduction/)
([RFC-7519](https://tools.ietf.org/html/rfc7519)).

 The compact representation of a signed JWT is a string that has three
 parts, each separated by a `.`:

```
 eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJKb2UifQ.ipevRNuRP6HflG8cFKnmUPtypruRC4fb1DWtoLL62SY
```

The 3 parts are:
 1. **Header**: contains token information, such as which algorithm is
    used for the signature
 2. **Payload**: application-specific information. Examples are: `subject`
    (user identifier or principal), `expiration`, etc. Any kind of
    information can be attached when creating a token and used later
    during validation
 3. **Signature**: cryptographic signature that ensures the authenticity
    of the token
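
To make the three parts concrete, the following sketch splits the example
token above and Base64URL-decodes the header and payload using only the JDK.
The class name `JwtDecodeSketch` is made up for illustration and is not part
of the proposal. Note that decoding does not verify anything; it only shows
what the token carries.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class JwtDecodeSketch {
    // Decode one Base64URL-encoded JWT part into its UTF-8 text.
    // The JDK decoder accepts input without '=' padding, as used by JWT.
    static String decodePart(String part) {
        byte[] raw = Base64.getUrlDecoder().decode(part);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String token = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJKb2UifQ."
                + "ipevRNuRP6HflG8cFKnmUPtypruRC4fb1DWtoLL62SY";
        String[] parts = token.split("\\.");

        System.out.println("header:  " + decodePart(parts[0])); // {"alg":"HS256"}
        System.out.println("payload: " + decodePart(parts[1])); // {"sub":"Joe"}
        // The third part is raw signature bytes, not JSON, so it stays encoded.
        System.out.println("signature (encoded): " + parts[2]);
    }
}
```

Because the header and payload are only encoded, not encrypted, anyone who
sees the token can read them.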

The main properties of JSON Web Tokens are:
  1. Tokens are created and signed with a secret key
  2. The secret key can either be a single key or a pair of public/private
     keys. If 2 keys are used, the private key is used to generate the
     tokens and the public key can be distributed to all servers to
     perform validation of such tokens.
  3. Each server can validate a token without talking to any external
     service
  4. By adding information in the payload, the tokens can be scoped in
     many ways, e.g.:
       * Expiration time
       * Limited scope by including reference to some resource
       * Restrict to some client IPs
  5. Revocations are not directly supported, but rather need to be
     implemented by maintaining a black-list of revoked tokens
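
As a rough illustration of properties 1 and 3 in the single-key case, the
sketch below signs a token with HMAC-SHA256 (the `HS256` algorithm) and
validates it locally by recomputing the signature, without contacting any
external service. It uses only JDK classes; the class and method names
(`Hs256Sketch`, `createToken`, `isValid`) are made up for this example, and
the actual provider would more likely rely on an existing JWT library.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class Hs256Sketch {
    private static final Base64.Encoder ENC = Base64.getUrlEncoder().withoutPadding();

    // Base64URL-encode a JSON string without padding, as JWT requires.
    static String b64(String json) {
        return ENC.encodeToString(json.getBytes(StandardCharsets.UTF_8));
    }

    // HMAC-SHA256 over the ASCII signing input "<header>.<payload>".
    static byte[] hmac(byte[] key, String signingInput) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(signingInput.getBytes(StandardCharsets.US_ASCII));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Illustration only: the subject is not JSON-escaped here.
    static String createToken(byte[] key, String subject) {
        String signingInput = b64("{\"alg\":\"HS256\"}") + "."
                + b64("{\"sub\":\"" + subject + "\"}");
        return signingInput + "." + ENC.encodeToString(hmac(key, signingInput));
    }

    // Recompute the signature locally and compare in constant time.
    // A real validator must also check the header's "alg" and any expiration.
    static boolean isValid(byte[] key, String token) {
        int lastDot = token.lastIndexOf('.');
        if (lastDot <= 0) {
            return false;
        }
        byte[] presented;
        try {
            presented = Base64.getUrlDecoder().decode(token.substring(lastDot + 1));
        } catch (IllegalArgumentException e) {
            return false; // signature part is not valid Base64URL
        }
        return MessageDigest.isEqual(hmac(key, token.substring(0, lastDot)), presented);
    }

    public static void main(String[] args) {
        byte[] key = "demo-secret-key-not-for-production".getBytes(StandardCharsets.UTF_8);
        String token = createToken(key, "Joe");
        System.out.println(token);
        System.out.println("valid: " + isValid(key, token));
    }
}
```

Changing any character of the header or payload, or validating with a
different key, makes `isValid` return `false`.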

Note: the TCP connection between client and broker is still expected
to be protected by TLS encryption. That is because the token shouldn't
be sent in clear text over the wire.

## Changes

### Token generation

We need to provide tools to allow an administrator to create the secret
key and the tokens for the users.

For example:

```shell
pulsar tokens create --key $SECRET_KEY --subject new-user-id
```

This will generate a new token and print it on the console. This will be
done by an administrator (or some automated service), and the token will
then be passed to the client.

Similarly, an administrator will be able to create the secret key to
bootstrap token generation:

```shell
pulsar tokens create-secret-key
```

### Client Side

From the client library perspective, a new `AuthenticationProvider` will
be added that takes a token and passes it directly to the broker on
connection. The client plugin will not interpret the token in any form;
it will just treat it as an opaque string.

This will ensure that multiple token formats can be used if required.

Additionally the `AuthenticationProvider` will allow the application
to pass a `Supplier<String>` to give the opportunity to fetch the
token from some config or secret store.
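
A minimal sketch of the supplier idea, assuming a hypothetical class
`TokenAuthSketch` (this is not the Pulsar client API, which is not defined
in this proposal): the token is fetched from the `Supplier<String>` each
time a connection is set up and is passed along without being parsed, so a
token rotated in the backing store is picked up on the next connection.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class TokenAuthSketch {
    private final Supplier<String> tokenSupplier;

    TokenAuthSketch(Supplier<String> tokenSupplier) {
        this.tokenSupplier = Objects.requireNonNull(tokenSupplier);
    }

    // Convenience factory for a fixed token string.
    static TokenAuthSketch ofToken(String token) {
        return new TokenAuthSketch(() -> token);
    }

    // Hypothetical hook, called once per new connection.
    // The token is treated as an opaque string and is never parsed.
    String authDataForNewConnection() {
        String token = tokenSupplier.get();
        if (token == null || token.isEmpty()) {
            throw new IllegalStateException("token supplier returned no token");
        }
        return token;
    }

    public static void main(String[] args) {
        // Stand-in for a config or secret store whose content can change.
        AtomicReference<String> secretStore = new AtomicReference<>("token-v1");
        TokenAuthSketch auth = new TokenAuthSketch(secretStore::get);

        System.out.println(auth.authDataForNewConnection()); // token-v1
        secretStore.set("token-v2");                         // token rotated
        System.out.println(auth.authDataForNewConnection()); // token-v2
    }
}
```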

### Broker side

The `AuthenticationProvider` in the broker will receive the token and
validate it with the secret key.

The secret key will either be provided in `broker.conf`, or a special
class implementing `Supplier<String>` will be specified to fetch the
secret key from a config or secret store.


--
Matteo Merli
<mm...@apache.org>

Re: [DISCUSS] PIP 25: Token based authentication

Posted by Ivan Kelly <iv...@apache.org>.
> Decoded Payload: {"sub":"my-test-subject"}

And sub is role? I guess you didn't use "role" since "sub" is a jwt builtin?

>> key? Where should the private key be stored? Is it PSK?
>>
>
> I would leave that outside the scope of this plugin.

It would be good to mention it in the PIP to give a little context.

>> How do we block compromised tokens?
>>
>
> I think the easiest approach is to keep 1 principal per each token. With
> this, to block a compromised token, we
> would just have to remove ACLs given to the principal associated with the
> token.
>
> Downside of this approach is that if that principal was granted permissions
> on multiple namespaces/topics, we'd have
> to remove from each of them.
>
> The other approach, which we should consider in future, is to add support
> of a revocation list.
> The implementation would be essentially a hash-set with notifications to
> all brokers. Each broker will be notified when a
> new principal is revoked and will force disconnection of any connected
> producer/consumer using that token.

1 principal per token is clunky for the reason you mentioned. It also
requires that we have an index somewhere mapping principal to resource.
We should get onto the revocation list stuff ASAP, but that requires
system topics to be well defined :/

-Ivan

Re: [DISCUSS] PIP 25: Token based authentication

Posted by Matteo Merli <mm...@apache.org>.
On Tue, Oct 30, 2018 at 7:02 AM Ivan Kelly <iv...@apache.org> wrote:

> Looks like a great scheme, but I'd like some more concrete details
> about how you see this interacting with Pulsar.
>

Replies inline and then I'll add to the wiki as well.


>
> A couple of questions:
>
> How will the token be passed in HTTP? As a header?
>

Yes, the client will pass the token in a header, such as `X-Pulsar-Auth` or
`Authorization`. The header name is still to be determined.


> What does the concrete payload look like?
>

The token will look like:

eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJKb2UifQ.ipevRNuRP6HflG8cFKnmUPtypruRC4fb1DWtoLL62SY

This string is composed of 3 parts, separated by '.', each of them
separately encoded in Base64 (URL-safe variant).
The 3 parts are:
 * Header
 * Payload
 * Signature

Header and payload are actually JSON strings. Examples:
Decoded Header: {"alg":"RS256"}
Decoded Payload: {"sub":"my-test-subject"}


> How is the server configured to accept this? Is there a public/private
>

There are 2 options:
 1. Use a single "secret key" for generating and validating tokens
 2. Use public/private keys:
       * Private key is used to generate tokens
       * Public key is used to validate tokens.
     Pulsar broker would only need to have the public key in this case
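
To illustrate option 2, here is a sketch using only JDK classes: the private
key signs the token (`RS256`, i.e. RSA with SHA-256) and the public key alone
is enough to validate it. The class and method names are made up for this
example, and the key pair is generated in-process only to keep the sketch
self-contained; in practice the keys would be created by the admin tooling.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

final class Rs256Sketch {
    private static final Base64.Encoder ENC = Base64.getUrlEncoder().withoutPadding();

    static String b64(String json) {
        return ENC.encodeToString(json.getBytes(StandardCharsets.UTF_8));
    }

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Admin side: needs the private key. Subject is not JSON-escaped (sketch).
    static String createToken(PrivateKey privateKey, String subject) {
        String signingInput = b64("{\"alg\":\"RS256\"}") + "."
                + b64("{\"sub\":\"" + subject + "\"}");
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(privateKey);
            signer.update(signingInput.getBytes(StandardCharsets.US_ASCII));
            return signingInput + "." + ENC.encodeToString(signer.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Broker side: only the public key is required. Any failure means "invalid".
    static boolean isValid(PublicKey publicKey, String token) {
        int lastDot = token.lastIndexOf('.');
        if (lastDot <= 0) {
            return false;
        }
        try {
            byte[] signature = Base64.getUrlDecoder().decode(token.substring(lastDot + 1));
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(publicKey);
            verifier.update(token.substring(0, lastDot).getBytes(StandardCharsets.US_ASCII));
            return verifier.verify(signature);
        } catch (IllegalArgumentException | GeneralSecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        KeyPair keys = newKeyPair();
        String token = createToken(keys.getPrivate(), "my-test-subject");
        System.out.println("valid: " + isValid(keys.getPublic(), token));
    }
}
```

With this split, a compromised broker cannot mint new tokens, since it never
holds the private key.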


> key? Where should the private key be stored? Is it PSK?
>

I would leave that outside the scope of this plugin. The private key is
only needed by the system admin to generate new tokens (perhaps through
some automated service, which again is not in the scope of this proposal).

bin/pulsar tokens create-secret-key

This will write the newly generated key to stdout. The user can store the
key wherever deemed appropriate.

When creating a new token, there are several ways to pass the secret key to
the tool, e.g.:

 1. Read from a file: `bin/pulsar tokens create -f /path/to/secret.key --subject my-subject`
 2. Read from stdin: `bin/pulsar tokens create --stdin --subject my-subject`
 3. Read from env: `export SECRET_KEY=$(read my key); bin/pulsar tokens create --stdin --subject my-subject`


> How do we block compromised tokens?
>

I think the easiest approach is to keep 1 principal per token. With this,
to block a compromised token, we would just have to remove the ACLs given
to the principal associated with the token.

The downside of this approach is that if that principal was granted
permissions on multiple namespaces/topics, we'd have to remove them from
each one.

The other approach, which we should consider in the future, is to add
support for a revocation list. The implementation would essentially be a
hash-set with notifications to all brokers. Each broker will be notified
when a new principal is revoked and will force disconnection of any
connected producer/consumer using that token.
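
Very roughly, the broker-local part of such a revocation list could look
like the sketch below. All names are hypothetical, and how notifications
reach each broker is deliberately left out: `onRevoked` just stands for
"this broker was notified", and the disconnect hook stands in for closing
the affected producers/consumers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

final class RevocationListSketch {
    // Thread-safe hash-set of revoked principals.
    private final Set<String> revokedPrincipals = ConcurrentHashMap.newKeySet();
    private final Consumer<String> disconnectHook;

    RevocationListSketch(Consumer<String> disconnectHook) {
        this.disconnectHook = disconnectHook;
    }

    // Called when this broker is notified that a principal was revoked.
    // The hook runs only the first time a given principal is seen.
    void onRevoked(String principal) {
        if (revokedPrincipals.add(principal)) {
            disconnectHook.accept(principal);
        }
    }

    // Checked during authentication of new connections.
    boolean isRevoked(String principal) {
        return revokedPrincipals.contains(principal);
    }

    public static void main(String[] args) {
        List<String> disconnected = new ArrayList<>();
        RevocationListSketch revocations = new RevocationListSketch(disconnected::add);

        revocations.onRevoked("my-test-subject");
        System.out.println(revocations.isRevoked("my-test-subject")); // true
        System.out.println(revocations.isRevoked("other-subject"));   // false
        System.out.println(disconnected);                             // [my-test-subject]
    }
}
```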



-- 
Matteo Merli
<mm...@apache.org>

Re: [DISCUSS] PIP 25: Token based authentication

Posted by Ivan Kelly <iv...@apache.org>.
Looks like a great scheme, but I'd like some more concrete details
about how you see this interacting with Pulsar.


A couple of questions:

How will the token be passed in HTTP? As a header?
What does the concrete payload look like?
How is the server configured to accept this? Is there a public/private
key? Where should the private key be stored? Is it PSK?
How do we block compromised tokens?

-Ivan
