You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/05/31 11:19:05 UTC

[GitHub] [pulsar] coderzc opened a new issue, #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

coderzc opened a new issue, #15859:
URL: https://github.com/apache/pulsar/issues/15859

   ## Motivation
   
   Currently, the broker has an admin `health check` endpoint, which is very good, But in some scenarios, we may not be able to use the admin API (such as https://github.com/apache/pulsar/pull/13316#discussion_r773313991), so I propose to introduce the `HEALTH_CHECK` command in the broker binary protocol.
   
   ## Goal
   
   Introduce the `HEALTH_CHECK` command in the broker binary protocol.
   
   ### Protocol Changes
   
   ```proto
   message CommandHealthCheck {
       enum TopicVersion {
           V1 = 0;
           V2 = 1;
       }
       
       required uint64 request_id = 1;
       required TopicVersion topic_version = 2;
   }
   
   message CommandHealthCheckResponse {
       required uint64 request_id = 1;
       required bool ok = 2 [default = false];
   
       optional ServerError error_code = 3;
       optional string error_message   = 4;
   }
   ```
   When the `ok == true` indicates that the cluster is in a healthy state.
   Conversely, when the `ok == false` that the cluster is unhealthy, the specific error message can view `error_code` and `error_message`.
   
   ### Client API Changes
   
   ```java
   interface PulsarClient {
       // ....
       CompletableFuture<Boolean> healthCheck(TopicVersion topicVersion);
   }
   ```
   
   ## Implementation
   
   * Add a handler function to handle `CommandHealthCheck` at `ServerCnx`.
   * Add a handler function to handle `CommandHealthCheckResponse` at `ClientCnx`.
   * In order to better reuse some functions, move the [internalRunHealthCheck](https://github.com/apache/pulsar/blob/7800fbd5d8cdbeb307f98a8d9408bb5309f41c0d/pulsar-broker/src/main/java/org/apache/pulsar/broker/admin/impl/BrokersBase.java#L385) and [checkDeadlockedThreads](https://github.com/apache/pulsar/blob/7800fbd5d8cdbeb307f98a8d9408bb5309f41c0d/pulsar-broker/src/main/java/org/apache/pulsar/broker/admin/impl/BrokersBase.java#L364) to `BrokerService`.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] codelipenghui commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1159322677

   > we don't need this cost_time? Although it is not accurate enough.
   
   I think we can move out of this proposal. Since we will not select the optimal 
   cluster based on the performance metrics of the cluster.
   
   If we really want to do cluster selection based on the metrics, we can add it.
   Just make the changes focus on the proposal really want to do. The metrics may be
   a point that needs to consider, but it needs to discuss first.
   
   > Because it needs to use the ServiceNameResolver of BinaryProtoLookupService to get the socket address, perhaps should be created a new ServiceNameResolver.
   
   Adding a ServiceNameResolver is better. It's confusing to add a health check method in LookupService.
   
   > Add a new config healthCheckExpiryTime in the proxy server, it express health check result cache expiry time that unit is seconds.
   
   `healtchCheckInternalInSeconds` is more descriptive.
   And why add to the proxy server? not the broker configuration? It's better to add a new section `Configuration Change`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] coderzc commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
coderzc commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1171871901

   @michaeljmarshall I went back to you, please take a look, thanks
   https://lists.apache.org/thread/6td3wyfybsys4vf1tgf7fxy36w10nb5k


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] codelipenghui commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1162595742

   @coderzc I think we should also add a config to disable the new feature. And it's better to make it disabled by default. Others looks good to me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] coderzc commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
coderzc commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1154677331

   > Will each broker service set up the `healthCheck` service? Does all the client request to the same broker?
   
   @hangc0276 Each broker service will set up the `healthCheck` service.
   
   Client request which broker is determined by `serviceUrl`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] codelipenghui commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1163062821

   > Disable the feature on the client side or on the broker side?
   
   The broker side.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] github-actions[bot] commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1200606202

   The issue had no activity for 30 days, mark with Stale label.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] coderzc commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
coderzc commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1159475435

   @codelipenghui 
   > healtchCheckInternalInSeconds is more descriptive.
   And why add to the proxy server? not the broker configuration? It's better to add a new section Configuration Change
   
   Sorry, I write it wrongly, it is broker configuration. I will add a new section `Configuration Change`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] coderzc commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
coderzc commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1162721863

   > @coderzc I think we should also add a config to disable the new feature. And it's better to make it disabled by default. Others looks good to me.
   
   Disable the feature on the client side or on the broker side?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] coderzc commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
coderzc commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1163290308

   > https://lists.apache.org/thread/6td3wyfybsys4vf1tgf7fxy36w10nb5k
   
   Why need to disable the feature on the broker? The healthCheck is only triggered by the client, not a regular check by broker.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] hangc0276 commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1154003564

   Will each broker service set up the `healthCheck` service? Does all the client request to the same broker?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] codelipenghui commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1158781060

   >  required uint64 cost_time = 3;
   
   We can remove this one? The broker always returns the cached value, not a real publish latency
   
   > interface PulsarClient {
       // ....
       CompletableFuture<HealthCheckResult> healthCheck();
   }
   
   Could you please add some details about how this method will be used?
   It will check all the clusters or the connected cluster?
   
   > interface LookupService {
     //......
     CompletableFuture<HealthCheckResult> healthCheck();
   }
   
   Could you please add some context about why add this method to LookupService?
   
   > Caching healthCheck result 60s on boeker side.
   
   It should be configurable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] coderzc commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
coderzc commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1158795154

   > Could you please add some details about how this method will be used?
   It will check all the clusters or the connected cluster?
   
   ok, I add some details.
   It will check all the connected cluster.
   
   
   > Could you please add some context about why add this method to LookupService?
   Because it needs to use the ServiceNameResolver of BinaryProtoLookupService to get the socket address, perhaps should be created a new ServiceNameResolver.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] coderzc commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
coderzc commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1159917932

   > +
    @hangc0276 
    The healthCheck service on the broker side should be triggered by client instance of regular check by the broker to avoid  
    other brokers are idle?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] hangc0276 commented on issue #15859: PIP-172 : Introduce the `HEALTH_CHECK` command in the binary protocol

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on issue #15859:
URL: https://github.com/apache/pulsar/issues/15859#issuecomment-1159889696

   > For reducing broker load that broker will trigger healthCheck at regular intervals and collect the result
   
   @coderzc 
   Does the `healthCheck` operation on the broker side triggered by the broker itself? As you described, using plan A will only detect the requested broker, and check whether the broker is healthy or not. If the Pulsar cluster has 1000 brokers, and all the clients connect to the broker by serviceUrl, which only configured three brokers. It means only the three brokers will receive `healthCheck` requests, the other 997 brokers are idle in dealing with `healthCheck` requests. IMO, the `healthCheck` service on broker side is `trigger by client + regular check by broker` is better.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org