You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/09/19 20:46:31 UTC

[GitHub] [pulsar] addisonj opened a new pull request #12097: Bugfix: Fix rackaware placement policy init error

addisonj opened a new pull request #12097:
URL: https://github.com/apache/pulsar/pull/12097


   Since the release of Pulsar 2.8 and upgrade to BK 4.12, the default
   rackAwarePlacementPolicy has been failing with the following exception:
   
   ```
   org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to initialize DNS Resolver org.apache.pulsar.zookeeper.ZkBookieRackAffinityMapping, used default subnet resolver : java.lang.RuntimeException: java.lang.NullPointerException java.lang.NullPointerException
   ```
   
   This regression occured in commit https://github.com/apache/pulsar/commit/4c6026213b743a7f23ae2a5a6d37ee7404b066db
   
   The core of the issue is that `setConf` is called before
   `setBookieAddressResolver` has been set (see
   https://github.com/apache/bookkeeper/blob/034ef8566ad037937a4d58a28f70631175744f53/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/RackawareEnsemblePlacementPolicyImpl.java#L264-L286)
   
   This results in the NPE.
   
   We are safe to simply not eagerly init the cache in setConf as the
   getRack call will re-check the cache.
   
   We also protect against this possible NPE for safety
   
   This is difficult to test directly but we should have an integration test to validate that PlacementPolicy is working as expected.
   
   CC @eolivelli 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] eolivelli merged pull request #12097: Bugfix: Fix rackaware placement policy init error

Posted by GitBox <gi...@apache.org>.
eolivelli merged pull request #12097:
URL: https://github.com/apache/pulsar/pull/12097


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] addisonj commented on pull request #12097: Bugfix: Fix rackaware placement policy init error

Posted by GitBox <gi...@apache.org>.
addisonj commented on pull request #12097:
URL: https://github.com/apache/pulsar/pull/12097#issuecomment-923017644


   @eolivelli I do think it is worth the test, but I would suggest we make a follow up item to do that. While we could do a smoke test and see the logs, the log is pretty subtle and easy to miss... I would instead suggest we just put the time in to do a proper test that validates placement by querying metadata. However, I don't want to hold up this fix for that test as this is quite a big breakage


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] addisonj commented on a change in pull request #12097: Bugfix: Fix rackaware placement policy init error

Posted by GitBox <gi...@apache.org>.
addisonj commented on a change in pull request #12097:
URL: https://github.com/apache/pulsar/pull/12097#discussion_r712252857



##########
File path: pulsar-zookeeper-utils/src/main/java/org/apache/pulsar/zookeeper/ZkBookieRackAffinityMapping.java
##########
@@ -94,20 +90,25 @@ private void updateRacksWithHost(BookiesRackConfiguration racks) {
                 bookies.forEach((addr, bi) -> {
                     try {
                         BookieId bookieId = BookieId.parse(addr);
-                        BookieSocketAddress bsa = getBookieAddressResolver().resolve(bookieId);
-                        newRacksWithHost.updateBookie(group, bsa.toString(), bi);
-
-                        String hostname = bsa.getSocketAddress().getHostName();
-                        newBookieInfoMap.put(hostname, bi);
-
-                        InetAddress address = bsa.getSocketAddress().getAddress();
-                        if (null != address) {
-                            String hostIp = address.getHostAddress();
-                            if (null != hostIp) {
-                                newBookieInfoMap.put(hostIp, bi);
-                            }
+                        BookieAddressResolver addressResolver = getBookieAddressResolver();
+                        if (addressResolver == null) {
+                            LOG.warn("Bookie address resolver not yet initialized, skipping resolution");

Review comment:
       precautionary, I figure if it can return null we should be defensive




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] eolivelli commented on a change in pull request #12097: Bugfix: Fix rackaware placement policy init error

Posted by GitBox <gi...@apache.org>.
eolivelli commented on a change in pull request #12097:
URL: https://github.com/apache/pulsar/pull/12097#discussion_r711880571



##########
File path: pulsar-zookeeper-utils/src/main/java/org/apache/pulsar/zookeeper/ZkBookieRackAffinityMapping.java
##########
@@ -94,20 +90,25 @@ private void updateRacksWithHost(BookiesRackConfiguration racks) {
                 bookies.forEach((addr, bi) -> {
                     try {
                         BookieId bookieId = BookieId.parse(addr);
-                        BookieSocketAddress bsa = getBookieAddressResolver().resolve(bookieId);
-                        newRacksWithHost.updateBookie(group, bsa.toString(), bi);
-
-                        String hostname = bsa.getSocketAddress().getHostName();
-                        newBookieInfoMap.put(hostname, bi);
-
-                        InetAddress address = bsa.getSocketAddress().getAddress();
-                        if (null != address) {
-                            String hostIp = address.getHostAddress();
-                            if (null != hostIp) {
-                                newBookieInfoMap.put(hostIp, bi);
-                            }
+                        BookieAddressResolver addressResolver = getBookieAddressResolver();
+                        if (addressResolver == null) {
+                            LOG.warn("Bookie address resolver not yet initialized, skipping resolution");

Review comment:
       Does this case happen?
   Or is this only a precautionary null check?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] addisonj commented on pull request #12097: Bugfix: Fix rackaware placement policy init error

Posted by GitBox <gi...@apache.org>.
addisonj commented on pull request #12097:
URL: https://github.com/apache/pulsar/pull/12097#issuecomment-922635916


   /pulsarbot run-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] eolivelli commented on pull request #12097: Bugfix: Fix rackaware placement policy init error

Posted by GitBox <gi...@apache.org>.
eolivelli commented on pull request #12097:
URL: https://github.com/apache/pulsar/pull/12097#issuecomment-923028106


   >  I don't want to hold up this fix for that test
   This usually means that we won't be adding such test :-) 
   Usually  as soon as the problem/urgency is fixed no one takes care of following up (there is no more interest in spending cycles).
   
   By the way I don't want to block this patch, I hope you will find time to follow up.
   Please create an issue
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] addisonj commented on pull request #12097: Bugfix: Fix rackaware placement policy init error

Posted by GitBox <gi...@apache.org>.
addisonj commented on pull request #12097:
URL: https://github.com/apache/pulsar/pull/12097#issuecomment-922589562


   /pulsarbot run-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] eolivelli commented on pull request #12097: Bugfix: Fix rackaware placement policy init error

Posted by GitBox <gi...@apache.org>.
eolivelli commented on pull request #12097:
URL: https://github.com/apache/pulsar/pull/12097#issuecomment-922650048


   > I believe the test set the DNS resolver manually
   Which test ?
   
   My idea is to add an integration test that reproduces the behaviour of a sys admin that configures this feature.
   It should be easy to see the error in the logs of the test.
   
   It looks like we do not have such kind of integration tests.
   
   @addisonj do you think it is worth do add it ?
   
   btw please answer to my comment about the null check, and I am happy with this patch


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sijie commented on pull request #12097: Bugfix: Fix rackaware placement policy init error

Posted by GitBox <gi...@apache.org>.
sijie commented on pull request #12097:
URL: https://github.com/apache/pulsar/pull/12097#issuecomment-922649258


   @eolivelli I believe the test set the DNS resolver manually before the test. Hence it doesn't catch the real usage.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org