Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/03/11 00:02:16 UTC

[GitHub] [pulsar] mattisonchao opened a new pull request #14648: Failed to start broker when Linux load balancer can't get NIC speed

mattisonchao opened a new pull request #14648:
URL: https://github.com/apache/pulsar/pull/14648


   ### Motivation
   
   The context here: #14537
   
   The current design repeatedly logs an error while ignoring the NIC. The only real course of action for an operator is to reconfigure the broker, so it seems better to fail on startup: operators then know immediately that they need to fix the configuration, instead of finding out some time later while reading the logs.
   
   ### Modifications
   
   - Verify whether the VM reports a NIC speed.
   - Refactor ``LinuxBrokerHostUsageImpl`` and extract some Linux-specific operations into another class.
   
   ### Verifying this change
   
   - [x] Make sure that the change passes the CI checks.
   
   ### Documentation
   
   - [x] `no-need-doc` 
     
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] mattisonchao removed a comment on pull request #14648: Failed to start broker when Linux load balancer can't get NIC speed

Posted by GitBox <gi...@apache.org>.
mattisonchao removed a comment on pull request #14648:
URL: https://github.com/apache/pulsar/pull/14648#issuecomment-1064819854


   @gaozhangmin 
   
   Sorry, I made a mistake.
   We need to get all NIC speeds and then check them. If we don't get any NIC speed (the value is less than or equal to 0), we need to notify the user to set "loadBalancerOverrideBrokerNicSpeedGbps" so that the load balancer works.





[GitHub] [pulsar] mattisonchao commented on pull request #14648: Failed to start broker when Linux load balancer can't get NIC speed

Posted by GitBox <gi...@apache.org>.
mattisonchao commented on pull request #14648:
URL: https://github.com/apache/pulsar/pull/14648#issuecomment-1064819854


   @gaozhangmin 
   
   Sorry, I made a mistake.
   We need to get all NIC speeds and then check them. If we don't get any NIC speed (the value is less than or equal to 0), we need to notify the user to set "loadBalancerOverrideBrokerNicSpeedGbps" so that the load balancer works.
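
A minimal sketch of the check described in the comment above, under stated assumptions: the class `NicSpeedCheck`, its methods, and the `Map` of NIC name to reported speed are hypothetical and are not the code added by this pull request. It reads "we don't get any NIC speed" as "no NIC reports a value greater than 0"; only the setting name `loadBalancerOverrideBrokerNicSpeedGbps` comes from the thread.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not the class added by this pull request. */
final class NicSpeedCheck {

    /** Name of the broker setting mentioned in the comment above. */
    static final String OVERRIDE_KEY = "loadBalancerOverrideBrokerNicSpeedGbps";

    private NicSpeedCheck() {
    }

    /** Returns true when at least one NIC reports a speed greater than 0. */
    static boolean hasUsableSpeed(Map<String, Double> speedByNic) {
        return speedByNic.values().stream().anyMatch(speed -> speed != null && speed > 0);
    }

    /** Throws with an actionable message when no NIC reports a positive speed. */
    static void verify(Map<String, Double> speedByNic) {
        if (!hasUsableSpeed(speedByNic)) {
            throw new IllegalStateException("No NIC speed could be read from " + speedByNic.keySet()
                    + "; set " + OVERRIDE_KEY + " so that the load balancer works.");
        }
    }
}
```

Failing with an exception that names the setting gives the operator the fix in the same message that reports the problem, which is the behaviour the Motivation section argues for.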





[GitHub] [pulsar] Nicklee007 commented on pull request #14648: Failed to start broker when Linux load balancer can't get NIC speed

Posted by GitBox <gi...@apache.org>.
Nicklee007 commented on pull request #14648:
URL: https://github.com/apache/pulsar/pull/14648#issuecomment-1066702913


   > @gaozhangmin
   > 
   > If the Linux host has a legitimate NIC that doesn't report a speed, then regardless of how many NICs do have a speed, we should notify the user to set `loadBalancerOverrideBrokerNicSpeedGbps` to ensure that the user gets what they expect.
   
   @mattisonchao 
   I think that checking the speed configuration of every NIC is too strict a condition for starting the server. In some cases we only use eth0, and eth1, eth2, ... are not active, but /sys/class/net/eth1/type is still '1' for them and reading their 'speed' file fails with 'Invalid argument'.
   Maybe those inactive NICs need to be excluded from the list of NICs whose 'speed' file is checked and whose speed is counted.
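
One way to sketch the exclusion suggested above, assuming that "inactive" can be detected from the sysfs `operstate` attribute. That choice, the class `ActiveNicFilter`, and its method names are assumptions for illustration; the thread itself only mentions the `type` and `speed` files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: lists the NICs whose speed is worth checking. */
final class ActiveNicFilter {

    private ActiveNicFilter() {
    }

    /** Reads and trims a sysfs attribute, returning "" when it cannot be read. */
    static String readAttribute(Path nicDir, String attribute) {
        try {
            return Files.readString(nicDir.resolve(attribute)).trim();
        } catch (IOException e) {
            // A failed read (missing file, "Invalid argument", ...) is treated as "no value".
            return "";
        }
    }

    /** Returns the sorted names of type-1 NICs under netRoot whose operstate is "up". */
    static List<String> activeEthernetNics(Path netRoot) {
        try (Stream<Path> nics = Files.list(netRoot)) {
            return nics
                    .filter(nic -> "1".equals(readAttribute(nic, "type")))
                    .filter(nic -> "up".equals(readAttribute(nic, "operstate")))
                    .map(nic -> nic.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On a Linux host this would be called as `ActiveNicFilter.activeEthernetNics(Path.of("/sys/class/net"))`. Taking the root directory as a parameter keeps the sketch testable against a temporary directory. Some drivers report `unknown` rather than `up` in `operstate`, so a real implementation may need a more permissive rule.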





[GitHub] [pulsar] gaozhangmin commented on pull request #14648: Failed to start broker when Linux load balancer can't get NIC speed

Posted by GitBox <gi...@apache.org>.
gaozhangmin commented on pull request #14648:
URL: https://github.com/apache/pulsar/pull/14648#issuecomment-1064782341









[GitHub] [pulsar] nicoloboschi commented on a change in pull request #14648: Failed to start broker when Linux load balancer can't get NIC speed

Posted by GitBox <gi...@apache.org>.
nicoloboschi commented on a change in pull request #14648:
URL: https://github.com/apache/pulsar/pull/14648#discussion_r824691667



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/PulsarService.java
##########
@@ -635,6 +636,13 @@ public void start() throws PulsarServerException {
                         + "authenticationEnabled=true when authorization is enabled with authorizationEnabled=true.");
             }
 
+            if (config.isLoadBalancerEnabled() && LinuxInfoUtils.isLinux()) {

Review comment:
       If I set `loadBalancerOverrideBrokerNicSpeedGbps` I should not get the error, right? 
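
The condition the reviewer is asking about can be sketched as follows. This assumes the override is available as an `Optional<Double>`; the class `NicSpeedStartupGate` and its method are hypothetical and are not part of the diff above.

```java
import java.util.Optional;

/** Hypothetical decision helper; the names are illustrative, not Pulsar APIs. */
final class NicSpeedStartupGate {

    private NicSpeedStartupGate() {
    }

    /**
     * Returns true only when the startup check should run: the load balancer is
     * enabled, the host is Linux, and no NIC speed override has been configured.
     */
    static boolean mustVerifyNicSpeed(boolean loadBalancerEnabled, boolean isLinux,
                                      Optional<Double> overrideNicSpeedGbps) {
        return loadBalancerEnabled && isLinux && !overrideNicSpeedGbps.isPresent();
    }
}
```

With a gate like this, an operator who has set `loadBalancerOverrideBrokerNicSpeedGbps` never reaches the sysfs check, so a host whose NICs expose no speed can still start.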
   







[GitHub] [pulsar] mattisonchao commented on pull request #14648: Failed to start broker when Linux load balancer can't get NIC speed

Posted by GitBox <gi...@apache.org>.
mattisonchao commented on pull request #14648:
URL: https://github.com/apache/pulsar/pull/14648#issuecomment-1064824353


   @gaozhangmin 
   
   If the Linux host has a legitimate NIC that doesn't report a speed, then regardless of how many NICs do have a speed, we should notify the user to set ``loadBalancerOverrideBrokerNicSpeedGbps`` to ensure that the user gets what they expect.





[GitHub] [pulsar] mattisonchao commented on pull request #14648: Failed to start broker when Linux load balancer can't get NIC speed

Posted by GitBox <gi...@apache.org>.
mattisonchao commented on pull request #14648:
URL: https://github.com/apache/pulsar/pull/14648#issuecomment-1064825132


   @nicoloboschi @eolivelli @michaeljmarshall please take a look; this continues the discussion from #14537.





[GitHub] [pulsar] gaozhangmin removed a comment on pull request #14648: Failed to start broker when Linux load balancer can't get NIC speed

Posted by GitBox <gi...@apache.org>.
gaozhangmin removed a comment on pull request #14648:
URL: https://github.com/apache/pulsar/pull/14648#issuecomment-1064782340


    A Linux host can have many NICs, and some NICs of type 1 don't expose their speed.
   
    I don't think letting the broker fail is a good idea. @mattisonchao





[GitHub] [pulsar] mattisonchao commented on a change in pull request #14648: Failed to start broker when Linux load balancer can't get NIC speed

Posted by GitBox <gi...@apache.org>.
mattisonchao commented on a change in pull request #14648:
URL: https://github.com/apache/pulsar/pull/14648#discussion_r824709743



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/PulsarService.java
##########
@@ -635,6 +636,13 @@ public void start() throws PulsarServerException {
                         + "authenticationEnabled=true when authorization is enabled with authorizationEnabled=true.");
             }
 
+            if (config.isLoadBalancerEnabled() && LinuxInfoUtils.isLinux()) {

Review comment:
       You are right, I will fix it.



