You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by "RobertIndie (via GitHub)" <gi...@apache.org> on 2023/02/16 10:14:58 UTC

[GitHub] [pulsar] RobertIndie opened a new pull request, #19540: [fix][proxy] Fix using wrong client version in pulsar proxy

RobertIndie opened a new pull request, #19540:
URL: https://github.com/apache/pulsar/pull/19540

   <!--
   ### Contribution Checklist
     
     - PR title format should be *[type][component] summary*. For details, see *[Guideline - Pulsar PR Naming Convention](https://docs.google.com/document/d/1d8Pw6ZbWk-_pCKdOmdvx9rnhPiyuxwq60_TrD68d7BA/edit#heading=h.trs9rsex3xom)*. 
   
     - Fill out the template below to describe the changes contributed by the pull request. That will give reviewers the context they need to do the review.
     
     - Each pull request should address only one issue, not mix up code from multiple issues.
     
     - Each commit in the pull request has a meaningful commit message
   
     - Once all items of the checklist are addressed, remove the above text and this checklist, leaving only the filled out template below.
   -->
   
   
   ### Motivations
   
   Currently, if we connect the client to the proxy, the `clientVersion` won't be send to the broker and we can't get the client version using the PulsarAdmin.
   
   For example, there is no `clientVersion` field shown in the output of topic stats:
   ```
   "publishers" : [ {
       "accessMode" : "Shared",
       "msgRateIn" : 0.0,
       "msgThroughputIn" : 0.0,
       "averageMsgSize" : 0.0,
       "chunkedMessageRate" : 0.0,
       "producerId" : 0,
       "metadata" : { },
       "address" : "/127.0.0.1:65385",
       "producerName" : "AlvaroProducer",
       "connectedSince" : "2023-02-16T11:34:30.384548+08:00"
     } ],
   ```
   
   It works fine when directly connecting to the broker.
   
   The root cause is that the pulsar proxy doesn't pass the clientVersion from the client to the broker. It set it to `Pulsar proxy`. And thus it will be ignored due to here : 
   https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L693-L695
   
   ### Modifications
   
   * Use the correct clientVersion from the client
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   
   This change added tests and can be verified as follows:
   
   * Start proxy and pulsar broker
   * Start a consumer
   * Get topic stats from the topic and check the clientVersion field.
   
   ### Does this pull request potentially affect one of the following parts:
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   *If the box was checked, please highlight the changes*
   
   - [ ] Dependencies (add or upgrade a dependency)
   - [ ] The public API
   - [ ] The schema
   - [ ] The default values of configurations
   - [ ] The threading model
   - [ ] The binary protocol
   - [ ] The REST endpoints
   - [ ] The admin CLI options
   - [ ] The metrics
   - [ ] Anything that affects deployment
   
   ### Documentation
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   - [ ] `doc` <!-- Your PR contains doc changes. -->
   - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
   - [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
   - [ ] `doc-complete` <!-- Docs have been already added -->
   
   ### Matching PR in forked repository
   
   PR in forked repository: <!-- ENTER URL HERE -->
   
   <!--
   After opening this PR, the build in apache/pulsar will fail and instructions will
   be provided for opening a PR in the PR author's forked repository.
   
   apache/pulsar pull requests should be first tested in your own fork since the 
   apache/pulsar CI based on GitHub Actions has constrained resources and quota.
   GitHub Actions provides separate quota for pull requests that are executed in 
   a forked repository.
   
   The tests will be run in the forked repository until all PR review comments have
   been handled, the tests pass and the PR is approved by a reviewer.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] michaeljmarshall commented on pull request #19540: [fix][proxy] Fix using wrong client version in pulsar proxy

Posted by "michaeljmarshall (via GitHub)" <gi...@apache.org>.
michaeljmarshall commented on PR #19540:
URL: https://github.com/apache/pulsar/pull/19540#issuecomment-1439536226

   > Hi, @michaeljmarshall . Thanks for your comment.
   > 
   > > The client version is still filtered out for the Java client for the same reason that the Pulsar proxy client version is filtered out.
   > 
   > Do you have reproducible steps? It seems to work fine for me. The java client should always set the client_version here:
   > 
   > https://github.com/apache/pulsar/blob/954f406fab03ce8858883335ce6d5a676bb45c92/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java#L295-L297
   > 
   > It will not use `Pulsar Client` as the value here:
   > 
   > https://github.com/apache/pulsar/blob/954f406fab03ce8858883335ce6d5a676bb45c92/pulsar-common/src/main/java/org/apache/pulsar/common/protocol/Commands.java#L198
   
   Sorry, I read the code too quickly, I see that the Java Client's version string works as you described.
   
   > > Further, it would be helpful to know when a connection is being proxied, which brings into scope this PR. Instead of using just one clientVersion, I think we should add a new protobuf field specifically for the proxyVersion. I would also make the restriction that a client cannot set this field unless it authenticates as a proxy (assuming authentication/authorization is enabled).
   > 
   > I'm +1 for this. But from my understanding, this is out of the scope of this PR discussion. This is adding a new feature to show the version of the proxy that the client connects.
   
   I think that depends on whether the behavior before this PR was a bug. In my opinion, changing the `clientVersion` that the proxy sends is a larger change that increases the scope of this PR.
    
   > > Further, I think it is a stretch to call this a bug. I view it as just a missing feature.
   > 
   > We could only get the client version from the broker if the client connects to the broker directly instead of through the proxy. It results in different behavior when using the proxy. I think it's a bug because the proxy should be transparent to users for the `client_version`.
   
   To say "the proxy should be transparent to users for the `client_version`" seems more like finding a potentially inconsistent design decision than finding a bug. Ultimately, the question is how should the `client_version` field be used? Is it for users or administrators? I can see the argument for end users not needing to know the proxy is in the middle, but I can also see an argument for operators wanting to know that the client connected through the proxy and to know which version of the proxy is connected. It is very relevant auditing information.
   
   > Also, I have created a discussion here: https://lists.apache.org/thread/n59k537fhthjnzkfxtc2p4zk4l0cv3mp to discuss about the client version conflict issue. Welcome to join this thread.
   
   Thanks, I replied. I think the scope of the discussion could be increased, though.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] RobertIndie commented on pull request #19540: [fix][proxy] Fix using wrong client version in pulsar proxy

Posted by "RobertIndie (via GitHub)" <gi...@apache.org>.
RobertIndie commented on PR #19540:
URL: https://github.com/apache/pulsar/pull/19540#issuecomment-1441417476

   @michaeljmarshall 
   
   > Ultimately, the question is how should the client_version field be used? Is it for users or administrators? I can see the argument for end users not needing to know the proxy is in the middle, but I can also see an argument for operators wanting to know that the client connected through the proxy and to know which version of the proxy is connected. It is very relevant auditing information.
   
   I get your points now. From [the document](https://pulsar.apache.org/docs/next/administration-stats/#topic-stats) :
   > clientVersion is The client library version of this producer/consumer.
   
    I think the `client_version` should be literally represented as the client version but not the proxy version. And let's add a new field called `proxy version` as you just proposed. This way, the user does not need to bother whether the client Version refers to the directly connected client(which may be the proxy) or the proxied client. The end user could get it from `client_version` and the operators could get it from `proxy_version`.
   
   According to the document above, I think this PR is still a bug fix. Getting the clientVersion from a proxy is not worked before this PR, therefore, this PR doesn't change the existing behavior. From my understanding, this PR is good to go.
   
   I would like to propose a new PIP to add such a feature because it will change the protocol. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] michaeljmarshall commented on a diff in pull request #19540: [fix][proxy] Fix using wrong client version in pulsar proxy

Posted by "michaeljmarshall (via GitHub)" <gi...@apache.org>.
michaeljmarshall commented on code in PR #19540:
URL: https://github.com/apache/pulsar/pull/19540#discussion_r1110116306


##########
pulsar-proxy/src/main/java/org/apache/pulsar/proxy/server/ProxyClientCnx.java:
##########
@@ -66,7 +65,7 @@ protected ByteBuf newConnectCommand() throws Exception {
         authenticationDataProvider = authentication.getAuthData(remoteHostName);
         AuthData authData = authenticationDataProvider.authenticate(AuthData.INIT_AUTH_DATA);
         return Commands.newConnect(authentication.getAuthMethodName(), authData, protocolVersion,
-                PulsarVersion.getVersion(), proxyToTargetBrokerAddress, clientAuthRole, clientAuthData,
+                proxyConnection.clientVersion, proxyToTargetBrokerAddress, clientAuthRole, clientAuthData,

Review Comment:
   This change is incorrect. When we use the `ProxyClientCnx`, the proxy uses its own version of the protocol with the broker, not the original client's version.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] michaeljmarshall commented on pull request #19540: [fix][proxy] Fix using wrong client version in pulsar proxy

Posted by "michaeljmarshall (via GitHub)" <gi...@apache.org>.
michaeljmarshall commented on PR #19540:
URL: https://github.com/apache/pulsar/pull/19540#issuecomment-1442789467

   I just opened https://github.com/apache/pulsar/issues/19623, please let me know what you think.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] RobertIndie commented on a diff in pull request #19540: [fix][proxy] Fix using wrong client version in pulsar proxy

Posted by "RobertIndie (via GitHub)" <gi...@apache.org>.
RobertIndie commented on code in PR #19540:
URL: https://github.com/apache/pulsar/pull/19540#discussion_r1112505757


##########
pulsar-proxy/src/main/java/org/apache/pulsar/proxy/server/ProxyClientCnx.java:
##########
@@ -66,7 +65,7 @@ protected ByteBuf newConnectCommand() throws Exception {
         authenticationDataProvider = authentication.getAuthData(remoteHostName);
         AuthData authData = authenticationDataProvider.authenticate(AuthData.INIT_AUTH_DATA);
         return Commands.newConnect(authentication.getAuthMethodName(), authData, protocolVersion,
-                PulsarVersion.getVersion(), proxyToTargetBrokerAddress, clientAuthRole, clientAuthData,
+                proxyConnection.clientVersion, proxyToTargetBrokerAddress, clientAuthRole, clientAuthData,

Review Comment:
   The `clientVersion` here is different from the `protocol version`. There is another field called `protoclVersion` here:
   https://github.com/apache/pulsar/blob/954f406fab03ce8858883335ce6d5a676bb45c92/pulsar-common/src/main/proto/PulsarApi.proto#L273
   The proxy will choose the lowest protocol version for itself and the client:
   https://github.com/apache/pulsar/blob/954f406fab03ce8858883335ce6d5a676bb45c92/pulsar-proxy/src/main/java/org/apache/pulsar/proxy/server/ProxyConnection.java#L685-L687
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] RobertIndie merged pull request #19540: [fix][proxy] Fix using wrong client version in pulsar proxy

Posted by "RobertIndie (via GitHub)" <gi...@apache.org>.
RobertIndie merged PR #19540:
URL: https://github.com/apache/pulsar/pull/19540


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] michaeljmarshall commented on pull request #19540: [fix][proxy] Fix using wrong client version in pulsar proxy

Posted by "michaeljmarshall (via GitHub)" <gi...@apache.org>.
michaeljmarshall commented on PR #19540:
URL: https://github.com/apache/pulsar/pull/19540#issuecomment-1442582917

   Thanks for sharing that documentation @RobertIndie. I agree that `client_version` should be the client's version, and I am fine with classifying this as a bug. I think a PIP sounds like a good solution. I put together this PR today to show the basics of the PIP. I am wondering if we could also use this information in the prometheus metrics. Let me know what you think, and I'll put together a PIP.
   
   Thanks for discussing this problem!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] RobertIndie commented on pull request #19540: [fix][proxy] Fix using wrong client version in pulsar proxy

Posted by "RobertIndie (via GitHub)" <gi...@apache.org>.
RobertIndie commented on PR #19540:
URL: https://github.com/apache/pulsar/pull/19540#issuecomment-1437826208

   Hi, @michaeljmarshall . Thanks for your comment.
   
   > The client version is still filtered out for the Java client for the same reason that the Pulsar proxy client version is filtered out. 
   
   Do you have reproducible steps? It seems to work fine for me. The java client should always set the client_version here: 
   https://github.com/apache/pulsar/blob/954f406fab03ce8858883335ce6d5a676bb45c92/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java#L295-L297
   It will not  use `Pulsar Client` as the value here:
   https://github.com/apache/pulsar/blob/954f406fab03ce8858883335ce6d5a676bb45c92/pulsar-common/src/main/java/org/apache/pulsar/common/protocol/Commands.java#L198
   
   > In my opinion, the right solution is to consider removing the logic that filters out versions with the " " and then to make sure that all official clients have a meaningful version string.
   
   Agree. I thought that it was used for testing. But it doesn't seem to be useful now. I think we can remove this logic if there is no other objections.
   
   > Further, it would be helpful to know when a connection is being proxied, which brings into scope this PR. Instead of using just one clientVersion, I think we should add a new protobuf field specifically for the proxyVersion. I would also make the restriction that a client cannot set this field unless it authenticates as a proxy (assuming authentication/authorization is enabled).
   
   I'm +1 for this. But from my understanding, this is out of the scope of this PR discussion. This is adding a new feature to show the version of the proxy that the client connects. 
   
   > Further, I think it is a stretch to call this a bug. I view it as just a missing feature.
   
   We could only get the client version from the broker if the client connects to the broker directly instead of through the proxy. It results in different behavior when using the proxy. I think it's a bug because the proxy should be transparent to users for the `client_version`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] michaeljmarshall commented on a diff in pull request #19540: [fix][proxy] Fix using wrong client version in pulsar proxy

Posted by "michaeljmarshall (via GitHub)" <gi...@apache.org>.
michaeljmarshall commented on code in PR #19540:
URL: https://github.com/apache/pulsar/pull/19540#discussion_r1113585241


##########
pulsar-proxy/src/main/java/org/apache/pulsar/proxy/server/ProxyClientCnx.java:
##########
@@ -66,7 +65,7 @@ protected ByteBuf newConnectCommand() throws Exception {
         authenticationDataProvider = authentication.getAuthData(remoteHostName);
         AuthData authData = authenticationDataProvider.authenticate(AuthData.INIT_AUTH_DATA);
         return Commands.newConnect(authentication.getAuthMethodName(), authData, protocolVersion,
-                PulsarVersion.getVersion(), proxyToTargetBrokerAddress, clientAuthRole, clientAuthData,
+                proxyConnection.clientVersion, proxyToTargetBrokerAddress, clientAuthRole, clientAuthData,

Review Comment:
   Yes, I am familiar with the difference for the protocol and the client version. However, it seems debatable to me which client version is connected here because the proxy is re-writing all requests instead of just forwarding them. For the purpose of troubleshooting, it seems valuable to know that the proxy is in the middle for these lookup connections.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org