Posted to reviews@helix.apache.org by "parakhnr (via GitHub)" <gi...@apache.org> on 2023/04/18 16:42:04 UTC

[GitHub] [helix] parakhnr opened a new pull request, #2452: Fixing the ZkServer.start() to just check for free port when starting ZK

parakhnr opened a new pull request, #2452:
URL: https://github.com/apache/helix/pull/2452

   ### Issues
   - [X] My PR addresses the following Helix issues and references them in the PR description:
   
   Starting a ZK server is very time-consuming, and it slows down debugging of the integration tests since each test run starts ZK. This PR aims to optimise the start time.
   
   ### Description
   
   - [X] Here are some details about my PR, including screenshots of any UI changes:
   
   Currently `ZkServer.start()` queries all of the network interfaces to check whether `localhost` is among them. That check always succeeds, because we manually add [localhost](https://github.com/apache/helix/blob/386a77d566f1dc0b480c3bcbdb4a2880a8b8a4a9/zookeeper-api/src/main/java/org/apache/helix/zookeeper/zkclient/NetworkUtil.java#L43) to the list of resolved network IPs and hostnames. The call to `NetworkInterface.getNetworkInterfaces` is expensive, and it is the reason the ZK server takes so long to start. We also invoke the same method twice: once just for logging and once for verification 😞 
   
   Since `ZkServer` never accepts a ZK hostname to connect to, checking whether the port is busy should be sufficient.
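   
   As a rough illustration of a port-only check, here is a minimal sketch. It is not the Helix implementation: `PortCheck` and `isPortFree` are hypothetical names, and the sketch assumes that successfully binding a `java.net.ServerSocket` to the port is an acceptable definition of "free".
   
   ```java
   import java.io.IOException;
   import java.net.ServerSocket;

   // Hypothetical helper for illustration only; not part of Helix.
   final class PortCheck {

       /** Returns true if a server socket can be bound to the given port right now. */
       static boolean isPortFree(int port) {
           // Binding fails with an IOException (typically BindException)
           // when another listener already holds the port.
           try (ServerSocket ignored = new ServerSocket(port)) {
               return true;
           } catch (IOException e) {
               return false;
           }
       }

       public static void main(String[] args) {
           // 2181 is the default ZooKeeper client port.
           System.out.println("port 2181 free: " + isPortFree(2181));
       }
   }
   ```
   
   Binding and releasing is inherently racy, since another process can take the port between the check and the real start, so a check like this is best treated as an early, readable failure rather than a guarantee.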
   
   ### Tests
   - [X] The following tests are written for this issue:
   
   Since this is a helper class, no unit tests exist for it. However, integration tests and the Helix examples cover this path. I verified that the Helix examples still run correctly.
   
   - [X] The following is the result of the "mvn test" command on the appropriate module:
   
   Since the test classes are spread across modules, I ran the `PR_CI` workflow on this commit; the results of its execution are below.
   ```
   [info] ./helix-core/target/surefire-reports/TestSuite.txt: Tests run: 1325, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5,743.07 s - in TestSuite
   [info] ./metadata-store-directory-common/target/surefire-reports/TestSuite.txt: Tests run: 31, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.886 s - in TestSuite
   [info] ./helix-common/target/surefire-reports/TestSuite.txt: Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.405 s - in TestSuite
   [info] ./metrics-common/target/surefire-reports/TestSuite.txt: Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.384 s - in TestSuite
   [info] ./helix-lock/target/surefire-reports/TestSuite.txt: Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 59.772 s - in TestSuite
   [info] ./helix-view-aggregator/target/surefire-reports/TestSuite.txt: Tests run: 15, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 60.445 s <<< FAILURE! - in TestSuite
   Error:  Test failed: testHelixViewAggregator(org.apache.helix.view.integration.TestHelixViewAggregator)  Time elapsed: 31.594 s  <<< FAILURE!
   [info] ./helix-rest/target/surefire-reports/TestSuite.txt: Tests run: 209, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 166.17 s - in TestSuite
   [info] ./zookeeper-api/target/surefire-reports/TestSuite.txt: Tests run: 85, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 158.15 s - in TestSuite
   [info] ./recipes/task-execution/target/surefire-reports/TestSuite.txt: Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.585 s - in TestSuite
   [info] ./recipes/service-discovery/target/surefire-reports/TestSuite.txt: Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.637 s - in TestSuite
   [info] ./recipes/distributed-lock-manager/target/surefire-reports/TestSuite.txt: Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.615 s - in TestSuite
   [info] ./recipes/rsync-replicated-file-system/target/surefire-reports/TestSuite.txt: Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.62 s - in TestSuite
   [info] ./recipes/rabbitmq-consumer-group/target/surefire-reports/TestSuite.txt: Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.624 s - in TestSuite
   ```
   
   ### Commits
   
   - [X] My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Documentation (Optional)
   
   - [X] In case of new functionality, my PR adds documentation in the following wiki page:
   N/A
   
   ### Code Quality
   
   - [X] My diff has been formatted using helix-style.xml 
   (helix-style-intellij.xml if IntelliJ IDE is used)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] parakhnr commented on pull request #2452: Fixing the ZkServer.start() to just check for free port when starting ZK

Posted by "parakhnr (via GitHub)" <gi...@apache.org>.
parakhnr commented on PR #2452:
URL: https://github.com/apache/helix/pull/2452#issuecomment-1515176015

   This PR is ready to merge, approved by @junkaixue.
   
   Commit message:
   Fixing the ZkServer.start() to just check for free port when starting ZK.


[GitHub] [helix] junkaixue commented on pull request #2452: Fixing the ZkServer.start() to just check for free port when starting ZK

Posted by "junkaixue (via GitHub)" <gi...@apache.org>.
junkaixue commented on PR #2452:
URL: https://github.com/apache/helix/pull/2452#issuecomment-1513886908

   Would it be possible to make it configurable? I would like to keep the current behavior, as I am not sure how many people use this class. But I don't think there would be many, as this should only be used for testing.
   
   If you don't have a better way to make it configurable, let's keep it.


[GitHub] [helix] junkaixue commented on a diff in pull request #2452: Fixing the ZkServer.start() to just check for free port when starting ZK

Posted by "junkaixue (via GitHub)" <gi...@apache.org>.
junkaixue commented on code in PR #2452:
URL: https://github.com/apache/helix/pull/2452#discussion_r1170385924


##########
zookeeper-api/src/main/java/org/apache/helix/zookeeper/zkclient/NetworkUtil.java:
##########
@@ -22,77 +22,14 @@
 import java.io.IOException;
 import java.net.ConnectException;
 import java.net.InetAddress;
-import java.net.NetworkInterface;
 import java.net.Socket;
 import java.net.SocketException;
 import java.net.UnknownHostException;
-import java.util.Enumeration;
-import java.util.HashSet;
-import java.util.Set;
 
 public class NetworkUtil {
 
     public final static String OVERWRITE_HOSTNAME_SYSTEM_PROPERTY = "zkclient.hostname.overwritten";
 
-    public static String[] getLocalHostNames() {

Review Comment:
   Deleting a public API is not allowed in an open source project. Even if we are not using it anywhere in our own code, removing it is dangerous because open source users may depend on it.
   
   Please keep it here. If nothing in our code uses this API, you can annotate it with `@Deprecated`.
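   
   A minimal sketch of that suggestion, keeping the public method and only marking it deprecated, could look like the following. `LegacyHostNames` is a hypothetical stand-in class, and its method body is a placeholder rather than the real `NetworkUtil` logic.
   
   ```java
   // Hypothetical stand-in for a utility class with a public method that
   // internal code no longer calls but external users may still depend on.
   final class LegacyHostNames {

       /**
        * Returns the local host names.
        *
        * @deprecated no longer used internally; retained so existing
        *     external callers keep compiling.
        */
       @Deprecated
       public static String[] getLocalHostNames() {
           // Placeholder body (assumption): the real method would
           // enumerate network interfaces and resolve their host names.
           return new String[] {"localhost"};
       }

       public static void main(String[] args) {
           // Callers still compile; javac only reports a deprecation
           // warning for uses outside the declaring class.
           System.out.println(String.join(",", getLocalHostNames()));
       }
   }
   ```
   
   The signature stays source- and binary-compatible, so downstream code is unaffected apart from the compiler warning.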



[GitHub] [helix] parakhnr commented on pull request #2452: Fixing the ZkServer.start() to just check for free port when starting ZK

Posted by "parakhnr (via GitHub)" <gi...@apache.org>.
parakhnr commented on PR #2452:
URL: https://github.com/apache/helix/pull/2452#issuecomment-1515067820

   > > > Would it be possible to make it configurable? I would like to keep the current behavior, as I am not sure how many people use this class. But I don't think there would be many, as this should only be used for testing. If you don't have a better way to make it configurable, let's keep it.
   > > 
   > > 
   > > Hmmm.. we aren't changing the current behavior at all. We are just getting rid of the redundant calls.
   > > 
   > > * 1st call to `NetworkUtil.getLocalHostNames()` is used for logging.
   > > * 2nd call to `NetworkUtil.getLocalHostNames()` is used to search for `localhost` in the list of hostnames, which will always be present because we explicitly add it to the list over [here](https://github.com/apache/helix/blob/386a77d566f1dc0b480c3bcbdb4a2880a8b8a4a9/zookeeper-api/src/main/java/org/apache/helix/zookeeper/zkclient/NetworkUtil.java#L43).
   > > 
   > > I feel both of these calls are something we can avoid.
   > 
   > Sorry for the confusion. By "keep it" I meant keep this change.
   
   Ohh oops. Sorry I misunderstood it 😬  I would prefer to keep it this way, since making it configurable would complicate the change, and I feel we should not do that unless required.


[GitHub] [helix] junkaixue merged pull request #2452: Fixing the ZkServer.start() to just check for free port when starting ZK

Posted by "junkaixue (via GitHub)" <gi...@apache.org>.
junkaixue merged PR #2452:
URL: https://github.com/apache/helix/pull/2452


[GitHub] [helix] parakhnr commented on pull request #2452: Fixing the ZkServer.start() to just check for free port when starting ZK

Posted by "parakhnr (via GitHub)" <gi...@apache.org>.
parakhnr commented on PR #2452:
URL: https://github.com/apache/helix/pull/2452#issuecomment-1513920300

   > Would it be possible to make it configurable? I would like to keep the current behavior, as I am not sure how many people use this class. But I don't think there would be many, as this should only be used for testing. If you don't have a better way to make it configurable, let's keep it.
   
   Hmmm.. we aren't changing the current behavior at all. We are just getting rid of the redundant calls.
   * 1st call to `NetworkUtil.getLocalHostNames()` is used for logging.
   * 2nd call to `NetworkUtil.getLocalHostNames()` is used to search for `localhost` in the list of hostnames, which will always be present because we explicitly add it to the list over [here](https://github.com/apache/helix/blob/386a77d566f1dc0b480c3bcbdb4a2880a8b8a4a9/zookeeper-api/src/main/java/org/apache/helix/zookeeper/zkclient/NetworkUtil.java#L43).
   
   I feel both of these calls are something we can avoid.
   


[GitHub] [helix] vjagadish1989 commented on a diff in pull request #2452: Fixing the ZkServer.start() to just check for free port when starting ZK

Posted by "vjagadish1989 (via GitHub)" <gi...@apache.org>.
vjagadish1989 commented on code in PR #2452:
URL: https://github.com/apache/helix/pull/2452#discussion_r1170709953


##########
zookeeper-api/src/main/java/org/apache/helix/zookeeper/zkclient/ZkServer.java:
##########
@@ -83,55 +80,25 @@ public int getPort() {
 
     @PostConstruct
     public void start() {
-        final String[] localHostNames = NetworkUtil.getLocalHostNames();
-        String names = "";
-        for (int i = 0; i < localHostNames.length; i++) {
-            final String name = localHostNames[i];
-            names += " " + name;
-            if (i + 1 != localHostNames.length) {
-                names += ",";
-            }
-        }
-        LOG.info("Starting ZkServer on: [" + names + "] port " + _port + "...");
         startZooKeeperServer();
         _zkClient = new ZkClient(new ZkConnection("localhost:" + _port), 10000, -1, new BasicZkSerializer(new SerializableSerializer()), null, null, null, false);
         _defaultNameSpace.createDefaultNameSpace(_zkClient);
     }
 
     private void startZooKeeperServer() {
-        final String[] localhostHostNames = NetworkUtil.getLocalHostNames();

Review Comment:
   It seems `getLocalHostNames` is used merely for validation and logging, both of which seem wasteful when starting a local ZK server.
   
   I'm curious how much time each call takes, since it appears to enumerate interfaces etc. A quick micro-benchmark can tell us.



[GitHub] [helix] junkaixue commented on pull request #2452: Fixing the ZkServer.start() to just check for free port when starting ZK

Posted by "junkaixue (via GitHub)" <gi...@apache.org>.
junkaixue commented on PR #2452:
URL: https://github.com/apache/helix/pull/2452#issuecomment-1515058067

   > > Would it be possible to make it configurable? I would like to keep the current behavior, as I am not sure how many people use this class. But I don't think there would be many, as this should only be used for testing. If you don't have a better way to make it configurable, let's keep it.
   > 
   > Hmmm.. we aren't changing the current behavior at all. We are just getting rid of the redundant calls.
   > 
   > * 1st call to `NetworkUtil.getLocalHostNames()` is used for logging.
   > * 2nd call to `NetworkUtil.getLocalHostNames()` is used to search for `localhost` in the list of hostnames, which will always be present because we explicitly add it to the list over [here](https://github.com/apache/helix/blob/386a77d566f1dc0b480c3bcbdb4a2880a8b8a4a9/zookeeper-api/src/main/java/org/apache/helix/zookeeper/zkclient/NetworkUtil.java#L43).
   > 
   > I feel both of these calls are something we can avoid.
   
   Sorry for the confusion. By "keep it" I meant keep this change.


[GitHub] [helix] parakhnr commented on a diff in pull request #2452: Fixing the ZkServer.start() to just check for free port when starting ZK

Posted by "parakhnr (via GitHub)" <gi...@apache.org>.
parakhnr commented on code in PR #2452:
URL: https://github.com/apache/helix/pull/2452#discussion_r1170389242


##########
zookeeper-api/src/main/java/org/apache/helix/zookeeper/zkclient/NetworkUtil.java:
##########
@@ -22,77 +22,14 @@
 import java.io.IOException;
 import java.net.ConnectException;
 import java.net.InetAddress;
-import java.net.NetworkInterface;
 import java.net.Socket;
 import java.net.SocketException;
 import java.net.UnknownHostException;
-import java.util.Enumeration;
-import java.util.HashSet;
-import java.util.Set;
 
 public class NetworkUtil {
 
     public final static String OVERWRITE_HOSTNAME_SYSTEM_PROPERTY = "zkclient.hostname.overwritten";
 
-    public static String[] getLocalHostNames() {

Review Comment:
   Ohh ok. Gotcha! I won't mark it as deprecated since it's a utility class and it delivers on the functionality. I will just revert the file to its original state.



[GitHub] [helix] parakhnr commented on a diff in pull request #2452: Fixing the ZkServer.start() to just check for free port when starting ZK

Posted by "parakhnr (via GitHub)" <gi...@apache.org>.
parakhnr commented on code in PR #2452:
URL: https://github.com/apache/helix/pull/2452#discussion_r1171656029


##########
zookeeper-api/src/main/java/org/apache/helix/zookeeper/zkclient/ZkServer.java:
##########
@@ -83,55 +80,25 @@ public int getPort() {
 
     @PostConstruct
     public void start() {
-        final String[] localHostNames = NetworkUtil.getLocalHostNames();
-        String names = "";
-        for (int i = 0; i < localHostNames.length; i++) {
-            final String name = localHostNames[i];
-            names += " " + name;
-            if (i + 1 != localHostNames.length) {
-                names += ",";
-            }
-        }
-        LOG.info("Starting ZkServer on: [" + names + "] port " + _port + "...");
         startZooKeeperServer();
         _zkClient = new ZkClient(new ZkConnection("localhost:" + _port), 10000, -1, new BasicZkSerializer(new SerializableSerializer()), null, null, null, false);
         _defaultNameSpace.createDefaultNameSpace(_zkClient);
     }
 
     private void startZooKeeperServer() {
-        final String[] localhostHostNames = NetworkUtil.getLocalHostNames();

Review Comment:
   I ran the experiment locally, calling `NetworkUtil.getLocalHostNames()` in a loop 10 times. Each iteration took ~50 seconds in total, and the most expensive operation was `InetAddress.getCanonicalHostName()`, which took ~5 seconds each time we invoked it.
   
   NOTE: The numbers vary with the number of network interfaces on the machine and the number of IP addresses bound to each interface. In the experiment above I had 10 network interfaces, each with 1 IP address associated with it.
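   
   One way to reproduce this kind of measurement outside Helix is sketched below. It does not call `NetworkUtil.getLocalHostNames()`; it assumes that method's cost is dominated by one `getCanonicalHostName()` call per bound address, as reported above. `HostNameLookupTiming`, `localAddresses`, and `timeCanonicalLookups` are hypothetical names.
   
   ```java
   import java.net.InetAddress;
   import java.net.NetworkInterface;
   import java.net.SocketException;
   import java.util.ArrayList;
   import java.util.Collections;
   import java.util.Enumeration;
   import java.util.LinkedHashMap;
   import java.util.List;
   import java.util.Map;

   // Hypothetical timing harness for illustration; not part of Helix.
   final class HostNameLookupTiming {

       /** Collects every address bound to every local network interface. */
       static List<InetAddress> localAddresses() throws SocketException {
           List<InetAddress> addresses = new ArrayList<>();
           Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
           if (interfaces == null) {
               return addresses; // some JDK versions return null when nothing is found
           }
           for (NetworkInterface nic : Collections.list(interfaces)) {
               addresses.addAll(Collections.list(nic.getInetAddresses()));
           }
           return addresses;
       }

       /** Measures one getCanonicalHostName() call per address, in milliseconds. */
       static Map<InetAddress, Long> timeCanonicalLookups(List<InetAddress> addresses) {
           Map<InetAddress, Long> millisByAddress = new LinkedHashMap<>();
           for (InetAddress address : addresses) {
               long start = System.nanoTime();
               address.getCanonicalHostName(); // may trigger a reverse DNS lookup
               millisByAddress.put(address, (System.nanoTime() - start) / 1_000_000);
           }
           return millisByAddress;
       }

       public static void main(String[] args) throws SocketException {
           long totalMillis = 0;
           for (Map.Entry<InetAddress, Long> entry : timeCanonicalLookups(localAddresses()).entrySet()) {
               System.out.println(entry.getKey().getHostAddress() + ": " + entry.getValue() + " ms");
               totalMillis += entry.getValue();
           }
           System.out.println("total: " + totalMillis + " ms");
       }
   }
   ```
   
   With 10 interfaces of 1 address each at ~5 seconds per lookup, the per-address times would add up to roughly the ~50 seconds reported above.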


