Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/02/11 23:18:23 UTC

[GitHub] [pulsar] merlimat opened a new pull request #14252: Fixed detecting number of NICs in EC2

merlimat opened a new pull request #14252:
URL: https://github.com/apache/pulsar/pull/14252


   ### Motivation
   
   In some EC2 instances we get an error when trying to read the NIC speed: 
   
   ```
   $ cat /sys/class/net/ens5/speed
   cat: /sys/class/net/ens5/speed: Invalid argument
   ```
   
   When that happens, we ignore that NIC, which makes it impossible to manually override the NIC capacity in broker.conf: the configured value is scaled by the number of detected NICs, so once the NIC is ignored the value ends up multiplied by 0.
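
The failure mode and the fallback can be illustrated with a minimal sketch. It is not the Pulsar class: `NicProbeSketch`, `ETHERNET_TYPE`, and `totalCapacity` are hypothetical names, and the "override times NIC count" formula is an assumption based only on the description above. The `speed` then `type` check mirrors the idea in this pull request.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch for illustration; names and structure are not Pulsar's.
final class NicProbeSketch {

    // Linux reports 1 (ARPHRD_ETHER) in /sys/class/net/<nic>/type for Ethernet devices.
    static final int ETHERNET_TYPE = 1;

    // Accept the NIC if "speed" is readable; otherwise fall back to checking "type".
    static boolean isPhysicalNic(Path nicDir) {
        try {
            Files.readAllBytes(nicDir.resolve("speed"));
            return true;
        } catch (IOException speedError) {
            // Some EC2 instances fail this read with "Invalid argument".
            try {
                byte[] raw = Files.readAllBytes(nicDir.resolve("type"));
                String type = new String(raw, StandardCharsets.UTF_8).trim();
                return Integer.parseInt(type) == ETHERNET_TYPE;
            } catch (IOException | NumberFormatException typeError) {
                return false;
            }
        }
    }

    // Assumed model of the symptom: the configured per-NIC value is scaled by the NIC count.
    static double totalCapacity(double perNicOverride, int nicCount) {
        return perNicOverride * nicCount;
    }
}
```

Under this model, a count of zero detected NICs wipes out any configured override, which is the symptom described above; accepting Ethernet NICs whose speed is unreadable keeps the count non-zero.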
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] michaeljmarshall commented on a change in pull request #14252: Fixed detecting number of NICs in EC2

Posted by GitBox <gi...@apache.org>.
michaeljmarshall commented on a change in pull request #14252:
URL: https://github.com/apache/pulsar/pull/14252#discussion_r815015492



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/loadbalance/impl/LinuxBrokerHostUsageImpl.java
##########
@@ -231,8 +232,15 @@ private boolean isPhysicalNic(Path path) {
                 Files.readAllBytes(path.resolve("speed"));
                 return true;
             } catch (Exception e) {
-                // wireless nics don't report speed, ignore them.
-                return false;
+                // In some cases, VMs in EC2 won't have the speed reported on the NIC and will give a read-error.
+                // Check the type to make sure it's ethernet (type "1")
+                try {
+                    String type = new String(Files.readAllBytes(path.resolve("type")), StandardCharsets.UTF_8).trim();
+                    return Integer.parseInt(type) == 1;

Review comment:
       > When that happens, we're ignoring that NIC and it causes that we cannot even manually override the NIC capacity in broker.conf
   
   I see now that the point was to make it so it could be overridden.







[GitHub] [pulsar] michaeljmarshall commented on a change in pull request #14252: Fixed detecting number of NICs in EC2

Posted by GitBox <gi...@apache.org>.
michaeljmarshall commented on a change in pull request #14252:
URL: https://github.com/apache/pulsar/pull/14252#discussion_r815009835



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/loadbalance/impl/LinuxBrokerHostUsageImpl.java
##########
@@ -231,8 +232,15 @@ private boolean isPhysicalNic(Path path) {
                 Files.readAllBytes(path.resolve("speed"));
                 return true;
             } catch (Exception e) {
-                // wireless nics don't report speed, ignore them.
-                return false;
+                // In some cases, VMs in EC2 won't have the speed reported on the NIC and will give a read-error.
+                // Check the type to make sure it's ethernet (type "1")
+                try {
+                    String type = new String(Files.readAllBytes(path.resolve("type")), StandardCharsets.UTF_8).trim();
+                    return Integer.parseInt(type) == 1;

Review comment:
       @merlimat - since this now returns true in some additional cases, don't we also need to update the logic in getNicSpeedPath? We're seeing a new error in https://github.com/apache/pulsar/pull/14340 because this check now returns true in a new case.
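
One way a caller could tolerate a NIC that is accepted as physical but still has no readable speed is sketched below. This is an assumption-laden illustration, not the logic of `getNicSpeedPath` or the fix adopted in Pulsar: `NicSpeedSketch` and `readSpeedMbps` are hypothetical names. It relies only on the Linux sysfs convention that `speed` holds an integer in Mbit/s and may be unreadable or non-positive when the speed is unknown.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalDouble;

// Hypothetical helper for illustration; not part of LinuxBrokerHostUsageImpl.
final class NicSpeedSketch {

    // Returns the NIC speed in Mbit/s, or empty if the file is unreadable,
    // malformed, or reports a non-positive (unknown) value.
    static OptionalDouble readSpeedMbps(Path nicDir) {
        try {
            byte[] raw = Files.readAllBytes(nicDir.resolve("speed"));
            double speed = Double.parseDouble(new String(raw, StandardCharsets.UTF_8).trim());
            return speed > 0 ? OptionalDouble.of(speed) : OptionalDouble.empty();
        } catch (IOException | NumberFormatException e) {
            // Treat the speed as unknown; the caller decides whether to use a configured override.
            return OptionalDouble.empty();
        }
    }
}
```

Returning an empty value instead of throwing lets the caller fall back to the configured override without logging an error on every read.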







[GitHub] [pulsar] michaeljmarshall commented on pull request #14252: Fixed detecting number of NICs in EC2

Posted by GitBox <gi...@apache.org>.
michaeljmarshall commented on pull request #14252:
URL: https://github.com/apache/pulsar/pull/14252#issuecomment-1051120567


   Given that this change can lead to new and verbose error logs, we should highlight in the release notes how to mitigate this error.





[GitHub] [pulsar] codelipenghui merged pull request #14252: Fixed detecting number of NICs in EC2

Posted by GitBox <gi...@apache.org>.
codelipenghui merged pull request #14252:
URL: https://github.com/apache/pulsar/pull/14252


   

