Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/04/26 20:52:40 UTC

[GitHub] [pulsar] Rockyyost opened a new issue #10390: Consumers not receiving messages

Rockyyost opened a new issue #10390:
URL: https://github.com/apache/pulsar/issues/10390


   #### Expected behavior
   
   Consumers on a shared subscription should each receive messages from the queue. If we have 4K messages on one topic and 20 consumers, every consumer should be receiving messages.
   
   #### Actual behavior
   
   Only some of the consumers get messages, while the others remain idle. With 20 consumers, only about 4 or 5 receive messages at any given time; the rest sit idle.
   
   #### Steps to reproduce
   
   I'm not sure exactly. We're on an EKS cluster and used the Pulsar Helm deployment. By default, we have 4 pods subscribed to a single topic in Pulsar. Once messages are published to that topic, they are routed to those 4 pods. HPA rules in Kubernetes then spin up more pods, each of which subscribes to the same topic.
   
   #### System configuration
   **Pulsar version**: 2.7.1
   
   I've included the stats and internal stats readouts from Pulsar. Let me know if you need anything more.
   
   [stats.txt](https://github.com/apache/pulsar/files/6380202/stats.txt)
   [stats-internal.txt](https://github.com/apache/pulsar/files/6380203/stats-internal.txt)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] codelipenghui commented on issue #10390: Consumers not receiving messages

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #10390:
URL: https://github.com/apache/pulsar/issues/10390#issuecomment-1058891069


   The issue has had no activity for 30 days; marking it with the Stale label.





[GitHub] [pulsar] Rockyyost edited a comment on issue #10390: Consumers not receiving messages

Posted by GitBox <gi...@apache.org>.
Rockyyost edited a comment on issue #10390:
URL: https://github.com/apache/pulsar/issues/10390#issuecomment-831232921


   I'm not setting priorities, as far as I'm aware. How would I do that? Looking at the Python docs, I don't see a way; however, in another doc, I see that if it's not set, 0 is the default.
   
   Setting the receiver_queue_size worked once. It was great to see all the consumers working at the same time, getting messages and bringing down the backlog size. However, when trying it a second time, it regressed to only a few of the consumers getting messages while all the others sat idle. Over time, they all go idle while messages remain in the queue.
   
   I pulled the last few lines of logs from one of the consumers that has gone idle. The log starts after it completed the last message it received. To me it looks normal; I'm not sure if there are any insights you can pull from it. Here it is:
   
   `2021-05-03 12:16:08.471 INFO  [140082804930304] ConsumerStatsImpl:65 | Consumer [persistent://public/default/InferForecast, InferForecastWorker, 0] , ConsumerStatsImpl (numBytesRecieved_ = 112243, totalNumBytesRecieved_ = 112243, receivedMsgMap_ = {[Key: Ok, Value: 25], }, ackedMsgMap_ = {[Key: {Result: Ok, ackType: 0}, Value: 24], }, totalReceivedMsgMap_ = {[Key: Ok, Value: 25], }, totalAckedMsgMap_ = {[Key: {Result: Ok, ackType: 0}, Value: 24], })
   2021-05-03 12:26:08.472 INFO  [140082804930304] ConsumerStatsImpl:65 | Consumer [persistent://public/default/InferForecast, InferForecastWorker, 0] , ConsumerStatsImpl (numBytesRecieved_ = 0, totalNumBytesRecieved_ = 112243, receivedMsgMap_ = {}, ackedMsgMap_ = {}, totalReceivedMsgMap_ = {[Key: Ok, Value: 25], }, totalAckedMsgMap_ = {[Key: {Result: Ok, ackType: 0}, Value: 24], })
   2021-05-03 12:36:08.473 INFO  [140082804930304] ConsumerStatsImpl:65 | Consumer [persistent://public/default/InferForecast, InferForecastWorker, 0] , ConsumerStatsImpl (numBytesRecieved_ = 0, totalNumBytesRecieved_ = 112243, receivedMsgMap_ = {}, ackedMsgMap_ = {}, totalReceivedMsgMap_ = {[Key: Ok, Value: 25], }, totalAckedMsgMap_ = {[Key: {Result: Ok, ackType: 0}, Value: 24], })
   `
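   
   One thing that can be read off these lines: the cumulative received count is 25 and the cumulative acked count is 24, so one message appears to have been received but not acknowledged. Below is a minimal Python sketch for pulling that difference out of a stats line. The regular expressions are assumptions based only on the lines quoted above, and `outstanding_messages` is a hypothetical helper, not part of the Pulsar client.

```python
import re

# Patterns for the cumulative counters in a ConsumerStatsImpl log line.
# Assumption: they are derived only from the log lines quoted in this comment.
_TOTAL_RECEIVED = re.compile(
    r"totalReceivedMsgMap_ = \{\[Key: Ok, Value: (\d+)\]"
)
_TOTAL_ACKED = re.compile(
    r"totalAckedMsgMap_ = \{\[Key: \{Result: Ok, ackType: \d+\}, Value: (\d+)\]"
)


def outstanding_messages(log_line: str) -> int:
    """Return total received minus total acked for one stats log line.

    A counter that is missing from the line is treated as 0.
    """
    received = _TOTAL_RECEIVED.search(log_line)
    acked = _TOTAL_ACKED.search(log_line)
    received_count = int(received.group(1)) if received else 0
    acked_count = int(acked.group(1)) if acked else 0
    return received_count - acked_count


# Tail of the 12:26:08 line from the log above.
SAMPLE = (
    "ConsumerStatsImpl (numBytesRecieved_ = 0, totalNumBytesRecieved_ = 112243, "
    "receivedMsgMap_ = {}, ackedMsgMap_ = {}, "
    "totalReceivedMsgMap_ = {[Key: Ok, Value: 25], }, "
    "totalAckedMsgMap_ = {[Key: {Result: Ok, ackType: 0}, Value: 24], })"
)

print(outstanding_messages(SAMPLE))  # prints 1
```

   A non-zero result on an idle consumer may be worth comparing against the unacked counts in the broker stats, although on its own it does not explain why dispatching stopped.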
   
   
   For my second issue, yes, I can confirm that both exist.





[GitHub] [pulsar] Rockyyost commented on issue #10390: Consumers not receiving messages

Posted by GitBox <gi...@apache.org>.
Rockyyost commented on issue #10390:
URL: https://github.com/apache/pulsar/issues/10390#issuecomment-830091399


   Put another way, why would consumers sit idle when there are loads of messages in the queue for them?
   
   What settings need to be looked at?
   
   Some of the consumers are not idle at the start of a large set of messages being pushed, but then become idle. If I load up Pulsar with 4K messages and have 15 consumers, I've seen it where, toward the end, only one consumer is doing any real work while all the others sit idle. There have been times I've seen all the consumers sit idle while there are still plenty of messages in the queue.
   
   I've confirmed that the consumers are shared and that they all subscribe to the same topic name.





[GitHub] [pulsar] MarvinCai commented on issue #10390: Consumers not receiving messages

Posted by GitBox <gi...@apache.org>.
MarvinCai commented on issue #10390:
URL: https://github.com/apache/pulsar/issues/10390#issuecomment-830777698


   For the original issue: if consumers have the same priority, they should get messages in round-robin order as long as they still have room in their receiver queues. I'll try to look into this and reproduce it.
   For your new issue, can you confirm that the topic and subscription both exist when messages are not being dispatched? I'm asking because both the subscription and the topic can be auto-deleted if they're inactive for a certain amount of time.
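   
   One way to check both points is to read the topic stats from the broker's admin REST endpoint and look at the subscription entry. The following is a rough Python sketch, not a definitive tool. The admin base URL `http://localhost:8080` is an assumption, the helper names are hypothetical, and the example dictionary is made-up data that only mimics the shape of a stats response. The topic and subscription names are taken from the consumer log in this thread.

```python
import json
import urllib.request


def stats_url(admin_base: str, tenant: str, namespace: str, topic: str) -> str:
    """Build the admin REST URL for a persistent topic's stats."""
    return f"{admin_base}/admin/v2/persistent/{tenant}/{namespace}/{topic}/stats"


def fetch_stats(url: str) -> dict:
    """Fetch and decode the stats JSON.

    urlopen raises urllib.error.HTTPError if the broker answers with an
    error status, for example when the topic does not exist.
    """
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def summarize(stats: dict, subscription: str) -> dict:
    """Report whether the subscription exists, its backlog, and per-consumer permits."""
    sub = stats.get("subscriptions", {}).get(subscription)
    if sub is None:
        return {"exists": False}
    return {
        "exists": True,
        "backlog": sub.get("msgBacklog", 0),
        "consumers": [
            {
                "name": c.get("consumerName", ""),
                "permits": c.get("availablePermits", 0),
                "unacked": c.get("unackedMessages", 0),
            }
            for c in sub.get("consumers", [])
        ],
    }


# Hypothetical response fragment, used here instead of a live broker call.
example = {
    "subscriptions": {
        "InferForecastWorker": {
            "msgBacklog": 4000,
            "consumers": [
                {"consumerName": "pod-a", "availablePermits": 0, "unackedMessages": 1},
            ],
        }
    }
}
print(summarize(example, "InferForecastWorker"))
```

   Against a real cluster, something like `summarize(fetch_stats(stats_url("http://localhost:8080", "public", "default", "InferForecast")), "InferForecastWorker")` would be the equivalent call. A backlog above zero combined with consumers that report zero or negative permits would point toward flow control rather than a missing subscription.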
   





[GitHub] [pulsar] MarvinCai commented on issue #10390: Consumers not receiving messages

Posted by GitBox <gi...@apache.org>.
MarvinCai commented on issue #10390:
URL: https://github.com/apache/pulsar/issues/10390#issuecomment-830053905


   Hi there, were all consumers created with the same priority, or with different priorities?
   The dispatcher will always try to dispatch to the consumers with the highest priority first (0 > 1 > 2 ...); only if the highest priority level (0) has no available consumers will it try to dispatch to the next priority level (1).
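   
   To illustrate that rule, here is a toy Python model of it. This is a simplified sketch and not the broker's actual dispatcher code: the `Consumer` class, the `dispatch` function, and the idea of counting "permits" as free receiver-queue slots are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Consumer:
    name: str
    priority: int  # 0 is the highest priority
    permits: int   # free slots left in this consumer's receiver queue


def dispatch(consumers: List[Consumer], messages: int) -> Dict[str, int]:
    """Hand out messages one at a time and return the count delivered per consumer.

    Each message goes to a consumer in the highest priority level that still
    has permits; consumers within that level are rotated through in turn.
    """
    delivered = {c.name: 0 for c in consumers}
    cursor = 0
    for _ in range(messages):
        available = [c for c in consumers if c.permits > 0]
        if not available:
            break  # nobody has room, so the rest stays in the backlog
        top = min(c.priority for c in available)
        level = [c for c in available if c.priority == top]
        chosen = level[cursor % len(level)]
        cursor += 1
        chosen.permits -= 1
        delivered[chosen.name] += 1
    return delivered


group = [
    Consumer("a", priority=0, permits=2),
    Consumer("b", priority=0, permits=2),
    Consumer("c", priority=1, permits=5),
]
print(dispatch(group, 6))  # prints {'a': 2, 'b': 2, 'c': 2}
```

   In this model, consumer `c` only starts receiving once `a` and `b` have no permits left, which is the behavior described above for priority levels 0 and 1.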





[GitHub] [pulsar] Rockyyost edited a comment on issue #10390: Consumers not receiving messages

Posted by GitBox <gi...@apache.org>.
Rockyyost edited a comment on issue #10390:
URL: https://github.com/apache/pulsar/issues/10390#issuecomment-830091399


   Sorry @MarvinCai, I didn't see your comment before I added another.
   
   I haven't set priorities on the consumer, that I know of. I use the Python client.
   
   Below is an example of the consumer I created.
   
       self._consumer = self._client.subscribe(
           TopicName,
           subscription_name=sub_name,
           consumer_type=_pulsar.ConsumerType.Shared,
           receiver_queue_size=0,
       )
   
   I was finally able to get the expected behavior, at least initially, by setting the receiver_queue_size to 0. Kubernetes spawned the maximum number of pods, and all the consumers in those pods subscribed, got messages, and went to work on them. They then got more messages until the message queue went down to 0. After that, the pods were destroyed except for three of them, which is the minimum we keep around.
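   
   A possible reason the queue size made a difference, sketched as a toy Python model: with a large receiver queue, the pods that subscribe first can prefetch the whole backlog before the autoscaled pods join. This is only a simplified illustration under stated assumptions, not a confirmed diagnosis. It assumes a prefetch size of 1000 per consumer (which I believe is the Python client's default for `receiver_queue_size`), treats a queue size of 0 as "one message in flight per consumer", and uses hypothetical helper names.

```python
from typing import List


def fill(held: List[int], capacity: int, backlog: int) -> int:
    """Give messages round-robin to consumers with free slots.

    Mutates `held` (messages held per consumer) and returns the backlog
    that is still on the broker afterwards.
    """
    progressed = True
    while backlog > 0 and progressed:
        progressed = False
        for i in range(len(held)):
            if backlog > 0 and held[i] < capacity:
                held[i] += 1
                backlog -= 1
                progressed = True
    return backlog


def busy_consumers(messages: int, early: int, late: int, receiver_queue_size: int) -> int:
    """Count consumers holding at least one message after the late ones join."""
    # Assumption: queue size 0 is modelled as a single in-flight message.
    capacity = max(receiver_queue_size, 1)
    held = [0] * early
    backlog = fill(held, capacity, messages)  # early pods prefetch first
    held += [0] * late                        # autoscaled pods subscribe later
    fill(held, capacity, backlog)             # they only see what is left
    return sum(1 for h in held if h > 0)


print(busy_consumers(4000, early=4, late=16, receiver_queue_size=1000))  # prints 4
print(busy_consumers(4000, early=4, late=16, receiver_queue_size=0))     # prints 20
```

   In this model, 4 initial pods with a prefetch of 1000 absorb all 4K messages and leave the 16 later pods idle, while with no prefetch all 20 pods hold work. It does not cover the later case where nothing is dispatched at all.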
   
   However, some time later, we published another set of 4K messages, and none of the pods, which are still subscribed (as far as I can tell), have received any of those messages. In fact, Pulsar doesn't seem to want to dispatch them. I've tried killing all the pods, which causes Kubernetes to re-spawn them, but even those new pods, after subscribing, don't get messages.
   
   This leaves Pulsar's behavior a bit confusing. The messages are there, new ones I created, but it won't dispatch them.





[GitHub] [pulsar] Rockyyost commented on issue #10390: Consumers not receiving messages

Posted by GitBox <gi...@apache.org>.
Rockyyost commented on issue #10390:
URL: https://github.com/apache/pulsar/issues/10390#issuecomment-831232921


   I'm not setting priorities, as far as I'm aware. How would I do that? Looking at the Python docs, I don't see a way; however, in another doc, I see that if it's not set, 0 is the default.
   
   Setting the receiver_queue_size worked once. It was great to see all the consumers working at the same time, getting messages and bringing down the backlog size. However, when trying it a second time, it regressed to only a few of the consumers getting messages while all the others sat idle.
   
   For my second issue, yes, I can confirm that both exist.

