Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2019/09/25 09:11:03 UTC

Slack digest for #general - 2019-09-25

2019-09-24 10:21:52 UTC - pradeep: Hi Team,
I started a Pulsar consumer in failover mode without specifying a consumer name. Then I killed the consumer and restarted it for the same topic with the same subscription name (so it would have been given a new random consumer name), but I am not receiving any data from Pulsar.
When I checked with the pulsar-admin API, the previous consumer had not been removed from the topic stats (since I started in failover mode, the restarted consumer may never have had a chance to consume data).
----
2019-09-24 16:36:02 UTC - Matteo Merli: The topic stats report the TCP connection for every consumer
----
2019-09-24 16:36:24 UTC - Matteo Merli: can you verify that the original TCP connection for the 1st consumer was indeed torn down?
----
2019-09-24 16:49:27 UTC - pradeep: I just shut down the application. Wouldn't that close the consumer and producer connections if the application is shut down abruptly or forcefully?
----
2019-09-24 16:52:14 UTC - Jesse Zhang (Bose): My subscription gets into a bad state where some unacked messages are not redelivered, even after I restarted all the shared clients. Any ideas why? I have attached the stats in the thread.
----
2019-09-24 16:53:28 UTC - Jesse Zhang (Bose): server: standalone pulsar 2.3.1
```
{
     "averageMsgSize": 0.0,
     "deduplicationStatus": "Disabled",
     "msgRateIn": 0.0,
     "msgRateOut": 0.0,
     "msgThroughputIn": 0.0,
     "msgThroughputOut": 0.0,
     "publishers": [],
     "replication": {},
     "storageSize": 9779254,
     "subscriptions": {
         "mock-bmx-als-sub-feat-aa-restst-func": {
             "blockedSubscriptionOnUnackedMsgs": false,
             "consumers": [
                 {
                     "address": "/100.65.150.248:38860",
                     "availablePermits": 82,
                     "blockedConsumerOnUnackedMsgs": false,
                     "clientVersion": "2.4.0",
                     "connectedSince": "2019-09-20T19:21:17.541Z",
                     "consumerName": "my-consumer-name",
                     "metadata": {},
                     "msgRateOut": 0.0,
                     "msgRateRedeliver": 0.0,
                     "msgThroughputOut": 0.0,
                     "unackedMessages": 0
                 }
             ],
             "msgBacklog": 181,
             "msgRateExpired": 0.0,
             "msgRateOut": 0.0,
             "msgRateRedeliver": 0.0,
             "msgThroughputOut": 0.0,
             "type": "Shared",
             "unackedMessages": 0
         }
     }
}
```
----
2019-09-24 16:53:52 UTC - Matteo Merli: Normally it should.
----
2019-09-24 16:54:58 UTC - Matteo Merli: do you see any errors in broker logs?
----
2019-09-24 16:55:30 UTC - Jesse Zhang (Bose): let me check
----
2019-09-24 16:56:46 UTC - Jesse Zhang (Bose): I'm also attaching the internalStatus. In our usage, we ack the messages out of order.
----
2019-09-24 16:57:03 UTC - Jesse Zhang (Bose): 
----
2019-09-24 16:58:30 UTC - Matteo Merli: this looks ok
----
2019-09-24 17:12:09 UTC - Jesse Zhang (Bose): I checked the logs covering 24 hours before and after the issue happened. Nothing is logged at error level and there are no exceptions. I see some warn-level logs, but they seem to recur all the time.
----
2019-09-24 17:16:38 UTC - pradeep: But that was not happening. We had to restart the brokers to bring things back to normal.
----
2019-09-24 17:19:24 UTC - pradeep: 
----
2019-09-24 17:24:59 UTC - Jesse Zhang (Bose): We got into this scenario while running a redelivery test case against Pulsar: redeliver 5000 messages to the consumer every 10s (the client received them but did not ack them). After ~12 hours, we started to see only 48xx messages being redelivered each time (the issue had started). Then we started to ack all the messages. After that, we found that while the unacked message count is 0, there are 181 messages in the backlog, and we see `"availablePermits": 82` (it should be 100).
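A minimal sketch of how the symptom described above could be spotted from topic stats output. It assumes the JSON layout of the stats pasted earlier in this thread (`subscriptions`, `msgBacklog`, `unackedMessages`, `msgRateOut`, `consumers[].availablePermits`). The function name `find_stuck_subscriptions`, the heuristic itself, and the trimmed sample are hypothetical and not part of Pulsar.

```python
import json


def find_stuck_subscriptions(stats):
    """Return names of subscriptions that look stalled.

    Heuristic (an assumption, not a Pulsar rule): a subscription is
    suspicious when it has a backlog, nothing unacked, no outgoing
    traffic, and at least one connected consumer that still has permits.
    """
    stuck = []
    for name, sub in stats.get("subscriptions", {}).items():
        has_backlog = sub.get("msgBacklog", 0) > 0
        idle = (sub.get("msgRateOut", 0.0) == 0.0
                and sub.get("unackedMessages", 0) == 0)
        has_permits = any(c.get("availablePermits", 0) > 0
                          for c in sub.get("consumers", []))
        if has_backlog and idle and has_permits:
            stuck.append(name)
    return stuck


# Trimmed-down sample using the values reported in this thread.
SAMPLE = json.loads("""
{
  "subscriptions": {
    "mock-bmx-als-sub-feat-aa-restst-func": {
      "consumers": [{"availablePermits": 82, "unackedMessages": 0}],
      "msgBacklog": 181,
      "msgRateOut": 0.0,
      "unackedMessages": 0,
      "type": "Shared"
    }
  }
}
""")

print(find_stuck_subscriptions(SAMPLE))
```

A check like this only narrows down where to look. A subscription can match briefly for harmless reasons, for example right after a consumer reconnects.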
----
2019-09-24 17:28:29 UTC - Jesse Zhang (Bose): I restarted the client, but these 181 messages were still not redelivered. In the internal-status, I see the read position is at the end of the queue: `"readPosition": "13:17210"`
----
2019-09-24 17:32:56 UTC - Matteo Merli: Read position being there is expected, though the broker should deliver the pending messages first
----
2019-09-24 17:33:42 UTC - Matteo Merli: Would you be able to share a heap dump of the broker? That would give access to the dispatcher state and help us understand whether anything went wrong there
----
2019-09-24 17:39:05 UTC - Matteo Merli: One question for reproducing this: how many consumers were attached to the subscription?
----
2019-09-24 17:39:07 UTC - Jesse Zhang (Bose): Thanks for the reply. We have already recycled the server that was in the bad state, so I don't have the dump now. The next time I run into this, I will talk to our team to see whether I can share the dump.
----
2019-09-24 17:39:37 UTC - Jesse Zhang (Bose): only 1 consumer attached to the subscription
----
2019-09-24 17:40:04 UTC - Jesse Zhang (Bose): ConsumerType.Shared
----
2019-09-24 17:40:39 UTC - Jesse Zhang (Bose): We use the Go client,
github.com/apache/pulsar/pulsar-client-go v0.0.0-20190507044647-1f4a836a4648
with
<https://archive.apache.org/dist/pulsar/pulsar-2.4.0/DEB/>
running on Linux
----
2019-09-24 17:42:18 UTC - Matteo Merli: > We have recycled the bad status server already

One less invasive way of cleaning up the state is to force a topic reload, which would clear any such issues: `pulsar-admin topics unload $TOPIC`
+1 : Jesse Zhang (Bose), Jim Lambert, Luke Lu
----
2019-09-24 18:38:41 UTC - Devin G. Bost: Does anyone have a suggestion for how to test the output of the `context.newOutputMessage(..)` method?
I found a test in the source code that calls it, but the test is just checking to ensure it doesn’t throw an exception. It’s not actually checking the output.
----
2019-09-24 18:45:02 UTC - Devin G. Bost: Nvm. We found a way.
+1 : David Kjerrumgaard
----
2019-09-24 19:58:18 UTC - Luke Lu: What’s the expected behavior of unload on a topic with active producers and consumers?
----
2019-09-24 20:20:05 UTC - Colum: So I've got a question. I have a geo-replicated cluster set up across three DCs using 2.4.1. It works awesomely, but I'm trying to figure out whether I can avoid duplicate messages being consumed by subscribers in each DC. Say Messages 1, 2, and 3 come in. I'd like a consumer in DC 1 to consume Messages 1 and 3, and a consumer in another DC to consume Message 2.

I know replication happens asynchronously, which is fine. I would just like to know whether there is a native way to get multi-DC consumption without duplicate messages, without rolling my own logic into the consumer.
----
2019-09-24 20:23:01 UTC - Addison Higham: @Jerry Peng (or anyone else familiar with the context.stateStore in Pulsar Functions) it looks like we are getting an NPE when doing a `get` on a key that doesn't exist yet. Looking at <https://github.com/apache/pulsar/blob/master/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/state/StateContextImpl.java#L62>, it seems like this isn't handling the case where the table returns null. Has anyone else run into this?
----
2019-09-24 20:28:14 UTC - Addison Higham: it seems like this would be a problem for anyone using the state store... so I'm wondering if we are just doing something wrong
----
2019-09-24 20:31:36 UTC - Matteo Merli: Producers are notified that the current session is being closed, the topic is closed/reopened and producers and consumers will quickly reconnect
+1 : Luke Lu
----
2019-09-24 20:36:11 UTC - Jerry Peng: @Addison Higham I get the following exception when I try to get a key that doesn’t exist:
```
java.lang.RuntimeException: Failed to retrieve the state value for key 'foo'
        at org.apache.pulsar.functions.instance.ContextImpl.getState(ContextImpl.java:325) ~[pulsar-functions-instance.jar:?]
        at org.apache.pulsar.functions.api.examples.TestFunction.process(TestFunction.java:53) ~[?:?]
        at org.apache.pulsar.functions.api.examples.TestFunction.process(TestFunction.java:43) ~[?:?]
        at org.apache.pulsar.functions.instance.JavaInstance.handleMessage(JavaInstance.java:63) ~[pulsar-functions-instance.jar:?]
        at org.apache.pulsar.functions.instance.JavaInstanceRunnable.run(JavaInstanceRunnable.java:267) [pulsar-functions-instance.jar:?]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
```
----
2019-09-24 20:36:22 UTC - Jerry Peng: how are you using state in your function?
----
2019-09-24 20:37:15 UTC - Addison Higham: we are using `getStateAsync` so that is probably the difference
----
2019-09-24 20:39:26 UTC - Addison Higham: what we are doing: rewriting the `PulsarOffsetBackingStore` to use the `StateContext` APIs. Right now it uses a topic and syncs state by writing messages to that topic. Currently, it doesn't plumb through auth values
----
2019-09-24 20:39:35 UTC - Addison Higham: when it creates its own client
----
2019-09-24 20:40:57 UTC - Addison Higham: So, we figured it would be easy to re-work it to use the state API, but with our use case, we need to read state on startup of the `Source` we are implementing, but if it doesn't exist, start with an empty state
----
2019-09-24 20:42:00 UTC - Addison Higham: if we don't want that API to return nulls, then it seems like it should use a named exception instead of a runtime exception as it feels pretty yucky to catch a runtimeException or handle an NPE
----
2019-09-24 20:45:35 UTC - Jerry Peng: @Addison Higham there is incorrect behavior in the code
----
2019-09-24 20:46:01 UTC - Jerry Peng: getStateAsync should still return a valid completable future
----
2019-09-24 20:46:06 UTC - Jerry Peng: I will fix the issue
----
2019-09-24 20:46:25 UTC - Addison Higham: with a null? that is what I was thinking made sense
----
2019-09-24 20:46:34 UTC - Jerry Peng: if it's getState() and a key doesn't exist, it should return null
----
2019-09-24 20:46:50 UTC - Addison Higham: :thumbsup:  happy to be on the review
----
2019-09-24 20:47:04 UTC - Jerry Peng: yes, getStateAsync should return a completable future even if the key doesn't exist
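The contract agreed on in this exchange can be modeled with a short sketch. This is not Pulsar's implementation (which is Java): `InMemoryState`, `get_state`, and `get_state_async` are hypothetical names standing in for `getState()` and `getStateAsync()`, a plain dict stands in for the state table, and `concurrent.futures.Future` stands in for a completable future.

```python
from concurrent.futures import Future


class InMemoryState:
    """Toy state store illustrating the 'missing key yields null' contract."""

    def __init__(self):
        self._table = {}

    def put_state(self, key, value):
        self._table[key] = value

    def get_state(self, key):
        # A missing key yields None instead of raising.
        return self._table.get(key)

    def get_state_async(self, key):
        # Always hand back a valid future; for a missing key it
        # completes with None rather than failing or being None itself.
        future = Future()
        future.set_result(self._table.get(key))
        return future


# Startup pattern from the thread: read saved state, or start empty.
state = InMemoryState()
saved = state.get_state_async("offsets").result()
offsets = saved if saved is not None else {}
print(offsets)
```

With this contract, a caller that needs "start with an empty state if nothing is stored" only has to check for a null result and never has to catch a runtime exception.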
----
2019-09-24 22:07:18 UTC - Luke Lu: So the client library will do that automatically without throwing errors to callers?
----
2019-09-24 22:13:03 UTC - Matteo Merli: Correct. If the unavailability period is shorter than publishTimeout, clients will see no errors
+1 : Luke Lu
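A simplified model of the behavior described above, assuming the producer keeps retrying a pending message until a timeout elapses. The real client queues messages and reconnects in the background; `send_with_timeout`, `FakeClock`, and `topic_unavailable_until` are hypothetical names used only for illustration, and the timeout argument plays the role of the publishTimeout mentioned in the message.

```python
class FakeClock:
    """Simulated clock so the example runs instantly."""

    def __init__(self):
        self.t = 0.0

    def now(self):
        return self.t

    def sleep(self, seconds):
        self.t += seconds


def topic_unavailable_until(clock, ready_at):
    """Return a try_send() callable that fails until clock time ready_at."""
    return lambda: clock.now() >= ready_at


def send_with_timeout(try_send, timeout_s, now, sleep, retry_interval_s=1.0):
    """Retry try_send() until it succeeds or timeout_s has elapsed.

    Returns True if the message went out, False if the timeout was hit
    (the point at which a caller would see an error).
    """
    deadline = now() + timeout_s
    while True:
        if try_send():
            return True
        if now() >= deadline:
            return False
        sleep(retry_interval_s)


# Topic unloaded for 5s, timeout 30s: the caller sees no error.
clock = FakeClock()
print(send_with_timeout(topic_unavailable_until(clock, 5.0), 30.0,
                        clock.now, clock.sleep))  # True

# Topic unavailable for 60s, timeout 30s: the send times out.
clock = FakeClock()
print(send_with_timeout(topic_unavailable_until(clock, 60.0), 30.0,
                        clock.now, clock.sleep))  # False
```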
----
2019-09-25 00:17:51 UTC - Jerry Peng: @Addison Higham <https://github.com/apache/pulsar/pull/5272>
----
2019-09-25 00:19:21 UTC - Junli Antolovich: Hello, does anyone have experience in installing Pulsar on windows? I understand it can be installed on Mac and Linux, most of our customer base are using windows server 2012 and up (yes I know server 2012 extended support ends Oct 2023).  What would be the best strategy installing Pulsar on-premise, cluster and standalone?
----
2019-09-25 00:27:47 UTC - Ali Ahmed: @Junli Antolovich Pulsar is based on Java, so it can run in any environment that supports JDK 8. The startup scripts are written in bash, so either a bash shell on Windows is needed or someone will have to write a PowerShell or cmd equivalent, which doesn't exist yet. The alternative is to run Docker containers if your environment supports it.
----
2019-09-25 00:53:09 UTC - Junli Antolovich: @Ali Ahmed Thanks for the info. Based on the documentation "System requirements: Pulsar is currently available for MacOS and Linux. To use Pulsar, you need to install Java 8." (<https://pulsar.apache.org/docs/en/standalone/>), which implies it is not available to Windows or other platforms. Or does this document need to be updated?
----
2019-09-25 01:08:29 UTC - Ali Ahmed: It's accurate: there is currently no cmd or PowerShell script available to execute on Windows. If one is developed and committed, the documentation can be changed.
----
2019-09-25 05:37:12 UTC - pradeep: We are using a Pulsar proxy setup to get the connection. Could the issue be that the proxy is still holding an active connection with the broker even though the application has been shut down?
----
2019-09-25 07:34:42 UTC - Jianfeng Qiao: Anyone used this command "pulsar-admin topics list-in-bundle tenant/namespace options"?
----
2019-09-25 07:38:44 UTC - Poule: @Jianfeng Qiao does not work on my side
----
2019-09-25 07:38:49 UTC - Poule: i get an err
----
2019-09-25 07:41:09 UTC - Jianfeng Qiao: like this "Expected a command, got list-in-bundle"
----
2019-09-25 07:41:36 UTC - Jianfeng Qiao: BTW, I'm using 2.3.1
----
2019-09-25 07:43:02 UTC - Jianfeng Qiao: The document should give an example for each command.
----
2019-09-25 07:52:24 UTC - Poule: same msg in 2.4.1
----