Posted to dev@pulsar.apache.org by Yubiao Feng <yu...@streamnative.io.INVALID> on 2023/03/28 11:52:06 UTC

[DISCUSS] Cherry-pick #15121 into branch-2.10 to solve the issue sasl authentication failure

Hi community

### Summary
The admin CLI (`pulsar-admin`) and the Java admin client (`PulsarAdmin`) throw
an Unauthorized exception in either of the following scenarios:
- There is more than one broker in the cluster (see Issue 1 below).
- Authentication is enabled for both Pulsar Proxy and Pulsar Broker (see
Issue 2 below).

```
bin/pulsar-admin topics stats persistent://public/default/tp1
2023-03-28T07:30:58,453+0000 [main] INFO org.apache.pulsar.client.impl.auth.AuthenticationSasl - JAAS loginContext is: PulsarAdmin.
2023-03-28T07:30:58,583+0000 [main] INFO org.apache.pulsar.common.sasl.JAASCredentialsContainer - successfully logged in.
2023-03-28T07:30:58,587+0000 [pulsar-tgt-refresh-thread] INFO org.apache.pulsar.common.sasl.TGTRefreshThread - TGT refresh thread started.
2023-03-28T07:30:58,612+0000 [pulsar-tgt-refresh-thread] INFO org.apache.pulsar.common.sasl.TGTRefreshThread - Client principal is "pulsar-admin@SN.IO".
2023-03-28T07:30:58,613+0000 [pulsar-tgt-refresh-thread] INFO org.apache.pulsar.common.sasl.TGTRefreshThread - Server principal is "krbtgt/SN.IO@SN.IO".
2023-03-28T07:30:58,617+0000 [pulsar-tgt-refresh-thread] INFO org.apache.pulsar.common.sasl.TGTRefreshThread - TGT valid starting at:    Tue Mar 28 07:30:58 UTC 2023
2023-03-28T07:30:58,617+0000 [pulsar-tgt-refresh-thread] INFO org.apache.pulsar.common.sasl.TGTRefreshThread - TGT expires:    Wed Mar 29 07:30:58 UTC 2023
2023-03-28T07:30:58,617+0000 [pulsar-tgt-refresh-thread] INFO org.apache.pulsar.common.sasl.TGTRefreshThread - TGT refresh sleeping until: Wed Mar 29 03:12:29 UTC 2023
2023-03-28T07:30:59,861+0000 [main] INFO org.apache.pulsar.client.impl.auth.PulsarSaslClient - Using JAAS/SASL/GSSAPI auth to connect to server Principal broker/pulsar03,
HTTP 401 Unauthorized
Reason: HTTP 401 Unauthorized
```

I would like to cherry-pick https://github.com/apache/pulsar/pull/15121 into
branch-2.10 to fix it.

### Background
When Kerberos is used for authentication, Pulsar works like this:
- The client initializes a Kerberos ticket.
- The client sends a request to the broker.
- The broker identifies the client (it confirms through Kerberos that the
ticket is valid).
- The broker sends a token (we call it `sasl_role_token`) to the client. At
this point the session has been created successfully.
- From then on the client is authenticated through the `sasl_role_token`
and no longer uses Kerberos.

The `sasl_role_token` is generated by this logic: `Sha512(saslRoleName,
${secret})`; we call this `secret` the `sasl_sign_secret`.
In version `2.10.x`, `secret` is a random string initialized when the
broker starts, so every broker and proxy process has a different one.
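
To make the token logic concrete, here is a minimal sketch, not Pulsar's actual implementation. It assumes the token is the role name followed by a Base64-encoded SHA-512 digest of the role name plus the secret; the class name `RoleTokenSignerSketch` and the `&s=` separator are hypothetical. It shows why a token signed with one random secret is rejected by a server that holds a different one.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Base64;

/** Hypothetical sketch of a sasl_role_token signer; not Pulsar's real class. */
final class RoleTokenSignerSketch {
    // Assumed separator between the role name and its signature.
    private static final String SEPARATOR = "&s=";
    private final byte[] secret;

    RoleTokenSignerSketch(byte[] secret) {
        this.secret = secret.clone();
    }

    /** Mimics the 2.10.x behaviour described above: a random secret per process. */
    static RoleTokenSignerSketch withRandomSecret() {
        byte[] random = new byte[32];
        new SecureRandom().nextBytes(random);
        return new RoleTokenSignerSketch(random);
    }

    /** Builds "<role>&s=<signature>". */
    String sign(String saslRoleName) {
        return saslRoleName + SEPARATOR + digest(saslRoleName);
    }

    /** Returns the role name if the signature matches this secret, otherwise null. */
    String verify(String token) {
        int idx = token.lastIndexOf(SEPARATOR);
        if (idx < 0) {
            return null;
        }
        String role = token.substring(0, idx);
        String signature = token.substring(idx + SEPARATOR.length());
        byte[] expected = digest(role).getBytes(StandardCharsets.UTF_8);
        // Constant-time comparison of the expected and received signatures.
        return MessageDigest.isEqual(expected, signature.getBytes(StandardCharsets.UTF_8))
                ? role : null;
    }

    private String digest(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            md.update(value.getBytes(StandardCharsets.UTF_8));
            md.update(secret);
            return Base64.getEncoder().encodeToString(md.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available", e);
        }
    }

    public static void main(String[] args) {
        RoleTokenSignerSketch broker0 = withRandomSecret();
        RoleTokenSignerSketch broker1 = withRandomSecret();
        String token = broker1.sign("pulsar-admin");
        System.out.println("broker-1 accepts: " + (broker1.verify(token) != null));
        System.out.println("broker-0 accepts: " + (broker0.verify(token) != null));
    }
}
```

In this sketch, two signers built from the same secret bytes accept each other's tokens, while two signers with independent random secrets do not.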

### Issue 1
Suppose a cluster has two brokers and the topic `public/default/tp1` is
owned by broker-0. We get an error when we send `pulsar-admin topics
stats public/default/tp1` to broker-1.

The whole process goes like this:
- The client authenticates successfully and gets a token from broker-1.
- broker-1 tells the client to redirect to broker-0.
- The client's request to broker-0 carries the `sasl_role_token` generated
by broker-1.
- broker-0 cannot validate the `sasl_role_token` because its secret differs
from broker-1's, so it responds with 401.

### Issue 2
When authentication is enabled for both Pulsar Proxy and Pulsar Broker,
the error occurs as follows:
- The client authenticates successfully and gets a token from Pulsar Proxy.
- The proxy forwards the request to the broker.
- The broker cannot validate the `sasl_role_token` because its secret
differs from Pulsar Proxy's, so it responds with 401.

### Solutions
There are two possible solutions to this issue:

Solution 1
- The client saves a separate token for each server (e.g. ["broker-0",
"broker-1", "pulsar-proxy"]) so that no server receives a token issued by
another one. This fixes Issue 1.
- Proxy and Broker do not enable authentication at the same time. This
avoids Issue 2.
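
As a rough illustration of the first bullet, the client could key its tokens by server. This is only a sketch: `PerServerTokenCache` and the `authenticator` callback (which stands in for the Kerberos handshake with a single server) are hypothetical names, not part of the Pulsar client.

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical client-side cache holding one sasl_role_token per server. */
final class PerServerTokenCache {
    private final Map<String, String> tokensByServer = new ConcurrentHashMap<>();
    // Stand-in for the Kerberos handshake: takes "host:port", returns that server's token.
    private final Function<String, String> authenticator;

    PerServerTokenCache(Function<String, String> authenticator) {
        this.authenticator = authenticator;
    }

    /** Returns the token issued by the server that will receive this request. */
    String tokenFor(URI requestUri) {
        // The authority ("host:port") identifies the server that issued the token.
        return tokensByServer.computeIfAbsent(requestUri.getAuthority(), authenticator);
    }

    /** Drops a token, for example after that server answers 401. */
    void invalidate(URI requestUri) {
        tokensByServer.remove(requestUri.getAuthority());
    }
}
```

With such a cache, a redirect from broker-1 to broker-0 triggers a fresh handshake with broker-0 instead of replaying broker-1's token. It does not help with Issue 2, because the client never talks to the broker behind the proxy directly.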

Solution 2
- Make `sasl_sign_secret` configurable. Users can set it to the same value
on every server, so that all servers can validate each other's
`sasl_role_token`. PR #15121 does this.

I'd prefer Solution 2 because it is already in the master branch, so I want
to cherry-pick #15121 into branch-2.10.

### Forward Compatibility
In PR #15121, the config `sasl_sign_secret` is a new item in the config files.
Because it is required there, users get a system error if they do not set it.
To ensure forward compatibility, we can make this setting optional in
branch-2.10.
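
One way to picture the optional setting is the sketch below. It is an assumption, not the code in PR #15121: it supposes the setting points to a file containing the secret, and `SignSecretLoader` is a hypothetical name. When the setting is absent it falls back to the existing 2.10.x behaviour of a random per-process secret.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.SecureRandom;
import java.util.Optional;

/** Hypothetical loader for sasl_sign_secret with a backward-compatible default. */
final class SignSecretLoader {
    private SignSecretLoader() {
    }

    static byte[] load(Optional<Path> configuredSecretFile) {
        if (configuredSecretFile.isPresent()) {
            try {
                // Shared secret: every broker and proxy reads the same bytes.
                return Files.readAllBytes(configuredSecretFile.get());
            } catch (IOException e) {
                throw new UncheckedIOException("cannot read the sasl_sign_secret file", e);
            }
        }
        // Setting not configured: keep the old behaviour, a random secret per process.
        byte[] random = new byte[32];
        new SecureRandom().nextBytes(random);
        return random;
    }
}
```

Existing 2.10.x deployments that never set the value would keep starting as before, and only clusters that need cross-server tokens would have to configure it.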


Thanks
Yubiao Feng

Re: [DISCUSS] Cherry-pick #15121 into branch-2.10 to solve the issue sasl authentication failure

Posted by ma...@gmail.com.
I agree with cherry-picking PR 15121 to branch-2.10 and keeping compatibility.


Best,
Mattison


Re: [DISCUSS] Cherry-pick #15121 into branch-2.10 to solve the issue sasl authentication failure

Posted by Yubiao Feng <yu...@streamnative.io.INVALID>.
There are no objections, so I will cherry-pick #15121 into branch-2.10 today.

Thanks
Yubiao Feng


Re: [DISCUSS] Cherry-pick #15121 into branch-2.10 to solve the issue sasl authentication failure

Posted by 丛搏 <bo...@apache.org>.
+1 (Solution 2)

Thanks,
Bo

On Wed, Mar 29, 2023 at 09:47, PengHui Li <pe...@apache.org> wrote:
>
> Looks good to me to make it optional in branch-2.10 since we don't want to
> introduce any breaking behavior in subsequent patch releases.
>
> Thanks,
> Penghui

Re: [DISCUSS] Cherry-pick #15121 into branch-2.10 to solve the issue sasl authentication failure

Posted by PengHui Li <pe...@apache.org>.
Looks good to me to make it optional in branch-2.10 since we don't want to
introduce any breaking behavior in subsequent patch releases.

Thanks,
Penghui

On Tue, Mar 28, 2023 at 9:39 PM Dezhi Liu <de...@apache.org> wrote:

> I agree with cherry-picking PR 15121 to branch-2.10 and keeping compatibility.
>
>
> Best,
> Dezhi

Re: [DISCUSS] Cherry-pick #15121 into branch-2.10 to solve the issue sasl authentication failure

Posted by Dezhi Liu <de...@apache.org>.
I agree with cherry-picking PR 15121 to branch-2.10 and keeping compatibility.


Best,
Dezhi
