You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/09/28 12:49:47 UTC

[GitHub] [pulsar] mattisonchao opened a new pull request, #17870: [improve][broker] Use separate IO event loop group for internal client.

mattisonchao opened a new pull request, #17870:
URL: https://github.com/apache/pulsar/pull/17870

   ### Motivation
   
   In the current implementation, we use the broker IO event loop group for internal clients. If the internal client uses a lot of threads or the exception handling is not good (such as frequent exception thread). It will affect the throughput of the entire broker.
   
   ### Modifications
   
   - Use the separate IO event loop group for the internal clients.
   
   ### Verifying this change
   
   - [x] Make sure that the change passes the CI checks.
   
   ### Documentation
   
   - [x] `doc-not-needed` 
   (Please explain why)
   
   ### Matching PR in forked repository
   
   PR in forked repository:


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] lhotari commented on pull request #17870: [improve][broker] Use separate IO event loop group for internal client.

Posted by GitBox <gi...@apache.org>.
lhotari commented on PR #17870:
URL: https://github.com/apache/pulsar/pull/17870#issuecomment-1262048972

   > If the internal client uses a lot of threads or the exception handling is not good (such as frequent exception thread). It will affect the throughput of the entire broker.
   
   please elaborate more on this. What do you mean with "internal client uses a lot of threads" or "exception handling is not good"? Please provide examples.
   
   I can see why this is needed since the Pulsar code is not non-blocking. Running synchronized methods on Netty IO threads is harmful since it will become a performance bottleneck. We should make these problems visible and eventually address them.
   
   To detect blocking code, there's https://github.com/reactor/BlockHound that could be used. [Netty comes with BlockHound integration](https://github.com/netty/netty/blob/4.1/common/src/main/resources/META-INF/services/reactor.blockhound.integration.BlockHoundIntegration). BlockHound could be used in unit tests and integration tests to ensure that blocking method calls aren't done on Netty IO threads. With Pulsar we'd have long ways to fullfil this requirement. This just shows that we need a better concurrency model to guide the future development of Pulsar in Pulsar 3.0 and beyond that. The current situation is more or less a mess from a concurrency model perspective.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] mattisonchao commented on pull request #17870: [improve][broker] Use separate IO event loop group for internal client.

Posted by GitBox <gi...@apache.org>.
mattisonchao commented on PR #17870:
URL: https://github.com/apache/pulsar/pull/17870#issuecomment-1262333938

   Close this PR and try to fix the root probelm.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] eolivelli commented on a diff in pull request #17870: [improve][broker] Use separate IO event loop group for internal client.

Posted by GitBox <gi...@apache.org>.
eolivelli commented on code in PR #17870:
URL: https://github.com/apache/pulsar/pull/17870#discussion_r983433654


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/PulsarService.java:
##########
@@ -332,6 +334,10 @@ public PulsarService(ServiceConfiguration config,
 
         this.ioEventLoopGroup = EventLoopUtil.newEventLoopGroup(config.getNumIOThreads(), config.isEnableBusyWait(),
                 new DefaultThreadFactory("pulsar-io"));
+        // A shared io event group for broker clients to avoid internal clients from occupying IO
+        // threads for a long time and affecting the throughput of the broker.
+        this.brokerClientSharedIoEventLoopGroup = EventLoopUtil.newEventLoopGroup(config.getNumIOThreads(),
+                config.isEnableBusyWait(), new DefaultThreadFactory("broker-client-shared-io"));

Review Comment:
   I am not sure that busyWait applies here.
   I would force it to false



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] mattisonchao commented on pull request #17870: [improve][broker] Use separate IO event loop group for internal client.

Posted by GitBox <gi...@apache.org>.
mattisonchao commented on PR #17870:
URL: https://github.com/apache/pulsar/pull/17870#issuecomment-1262114675

   Hi @lhotari 
   Thanks for your professional explanation. 
   
   > Adding more threads and thread pools (event loops) is a bandaid to the problem, but doesn't solve the root cause.
   
   I totally agree with you, but in the current implementation, we have too much historical baggage and potential impact. For this situation, my recommendation is to lower the sphere of influence first. Then keep making it even better.
   
   Another problem is that we share the IO pool with all replicators. This means that if we have too many replicators on the broker, they will compete for resources with the broker. Plus, by design, I don't think it's a good idea for one component to compete with the entire broker. For example, a problem with a topic or replicator should not affect the entire broker.
   
   Please let me know what you think. thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] codelipenghui closed pull request #17870: [improve][broker] Use separate IO event loop group for internal client.

Posted by GitBox <gi...@apache.org>.
codelipenghui closed pull request #17870: [improve][broker] Use separate IO event loop group for internal client.
URL: https://github.com/apache/pulsar/pull/17870


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] mattisonchao commented on a diff in pull request #17870: [improve][broker] Use separate IO event loop group for internal client.

Posted by GitBox <gi...@apache.org>.
mattisonchao commented on code in PR #17870:
URL: https://github.com/apache/pulsar/pull/17870#discussion_r982365151


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/PulsarService.java:
##########
@@ -332,6 +334,10 @@ public PulsarService(ServiceConfiguration config,
 
         this.ioEventLoopGroup = EventLoopUtil.newEventLoopGroup(config.getNumIOThreads(), config.isEnableBusyWait(),
                 new DefaultThreadFactory("pulsar-io"));
+        // A shared io event group for broker clients to avoid internal clients from occupying IO
+        // threads for a long time and affecting the throughput of the broker.
+        this.brokerClientSharedIoEventLoopGroup = EventLoopUtil.newEventLoopGroup(config.getNumIOThreads(),

Review Comment:
   `config.getNumIOThreads()` 
   If we consider using the different values for the internal clients, we can introduce the new config in the future.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] lhotari commented on pull request #17870: [improve][broker] Use separate IO event loop group for internal client.

Posted by GitBox <gi...@apache.org>.
lhotari commented on PR #17870:
URL: https://github.com/apache/pulsar/pull/17870#issuecomment-1262151799

   > we have too much historical baggage and potential impact.
   
   It would be good to clarify this if you have something specific to point out. I can see that there are issues with blocking code. It would be useful if you could point out clear examples of problems. 
   
   > Another problem is that we share the IO pool with all replicators. This means that if we have too many replicators on the broker, they will compete for resources with the broker.
   
   Sharing the IO pool isn't a problem on its own. They will compete on the same resources even in the case of using separate thread pools. 
   
   There's a detail in Pulsar that the BookKeeper client uses the Pulsar Broker IO threads by default, code at https://github.com/apache/pulsar/blob/df5e0e1869ff7ce55489e4a7853172fa37b2b59d/pulsar-broker/src/main/java/org/apache/pulsar/broker/BookKeeperClientFactoryImpl.java#L95-L97 
   
   What do you think about that solution? :) 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] mattisonchao closed pull request #17870: [improve][broker] Use separate IO event loop group for internal client.

Posted by GitBox <gi...@apache.org>.
mattisonchao closed pull request #17870: [improve][broker] Use separate IO event loop group for internal client.
URL: https://github.com/apache/pulsar/pull/17870


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org