Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2020/06/10 09:11:06 UTC

Slack digest for #general - 2020-06-10

2020-06-09 09:40:30 UTC - Manoj: @Manoj has joined the channel
----
2020-06-09 09:47:55 UTC - Igor: @Igor has joined the channel
----
2020-06-09 09:50:01 UTC - Manoj: When will Pulsar 2.6.0 be released?
----
2020-06-09 09:51:00 UTC - Penghui Li: The PRs for 2.6.0 are merged; I will publish a vote soon.
+1 : charles, Manoj, Konstantinos Papalias, Kirill Kosenko, Julius S, Sijie Guo
----
2020-06-09 10:02:11 UTC - skans100: @skans100 has joined the channel
----
2020-06-09 10:06:08 UTC - AaronRobert: @AaronRobert has joined the channel
----
2020-06-09 10:17:44 UTC - Shishir Pandey: Looks like PIP-31 will be in 2.7 and not in 2.6; is this accurate?
----
2020-06-09 10:55:17 UTC - Marcio Martins: I verified the whole EKS OIDC process is working. EKS is injecting the environment variable and the secret with the token in the pod, but somehow it seems the offloader will prefer to use the node's instance data instead of the web authority. Found this, which I suspect might be what's going on: <https://github.com/aws/aws-sdk-java/issues/2136>

I need to verify what version is being used by the offloader/pulsar.
----
2020-06-09 11:18:54 UTC - Dhakshin: I am running standalone Pulsar (on my laptop as a single node). So I understand your point, but I need to update standalone.conf in my case (rather than broker.conf).

One observation: in standalone.conf I could not see the below property. As per the Pulsar documentation (<https://pulsar.apache.org/docs/en/reference-metrics/#consumer-metrics>), it looks like this is required.

*After that I added the below property manually in standalone.conf; now it's working fine.*

# Enable consumer level metrics. default is false
*exposeConsumerLevelMetricsInPrometheus=true*

2. I have a requirement to monitor the ingress and egress rates of Pulsar. Please suggest which metrics should be considered for this.

3. The below metrics values always come out as 0 (even though I produced and consumed 1000 messages using sample producer and consumer Java code). Please let me know if there is a reason for this.

*pulsar_rate_in, pulsar_rate_out, pulsar_throughput_in and pulsar_throughput_out*

Thank you
----
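On the zero-valued rate metrics: the `pulsar_rate_*` and `pulsar_throughput_*` gauges are averages over the broker's stats window, so they fall back to 0 shortly after traffic stops; scraping while the producer/consumer are running should show non-zero values. A quick way to check, assuming a local standalone broker on the default web service port 8080 (an assumption; adjust if changed):

```shell
# Scrape the Prometheus endpoint and filter the rate/throughput gauges.
# Run this WHILE traffic is flowing; the gauges are windowed averages.
curl -s http://localhost:8080/metrics | grep -E '^pulsar_(rate|throughput)_(in|out)'
```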
2020-06-09 13:43:14 UTC - Arnaud Briche: @Arnaud Briche has joined the channel
----
2020-06-09 14:31:40 UTC - jujugrrr: Hi, I'm trying to force delete a topic but I'm getting a 500 Internal Server Error; on the broker I can see:
```14:27:01.520 [pulsar-web-33-5] ERROR org.apache.pulsar.broker.admin.impl.PersistentTopicsBase - [null] Failed to delete topic forcefully <persistent://ten/ns/my-topic>
java.util.concurrent.ExecutionException: org.apache.pulsar.broker.service.BrokerServiceException$TopicBusyException: Topic has 1 connected producers/consumers
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_232]```
Should the force not bypass this?
----
2020-06-09 14:33:05 UTC - jujugrrr: also, pulsar-admin returns no active subscriptions
----
2020-06-09 14:35:17 UTC - jujugrrr: I've terminated the topic; does that prevent the deletion?
----
2020-06-09 14:51:55 UTC - tuteng: You can try pulsar-admin unload?
----
2020-06-09 14:52:39 UTC - jujugrrr: that worked, thanks!
----
2020-06-09 14:52:48 UTC - jujugrrr: not sure I understand why
----
2020-06-09 14:52:54 UTC - jujugrrr: :slightly_smiling_face: I'm new to Pulsar
----
2020-06-09 15:10:22 UTC - Alexandre DUVAL: how to activate it?
----
2020-06-09 15:39:22 UTC - Alexandre DUVAL: so, from another node, recover the node which I want to decommission; ok
----
2020-06-09 15:49:58 UTC - Alexandre DUVAL: `15:49:03.043 [main] INFO  org.apache.bookkeeper.client.BookKeeperAdmin - Count of Ledgers which need to be rereplicated: 359` :confused:
----
2020-06-09 15:59:51 UTC - Alexandre DUVAL: ```ledgerID: 347973
LedgerMetadata{formatVersion=2, ensembleSize=1, writeQuorumSize=1, ackQuorumSize=1, state=CLOSED, length=120816610, lastEntryId=463, digestType=CRC32, password=base64:, ensembles={0=[clevercloud-bookkeeper-c1-n1:3181]}, customMetadata={}}```
----
2020-06-09 16:00:30 UTC - Alexandre DUVAL: I can't update them to have ensembleSize=2?
----
2020-06-09 16:00:37 UTC - Alexandre DUVAL: @Sijie Guo ^
----
2020-06-09 16:03:34 UTC - Addison Higham: I think it should be all clients, it is applied by the server
----
2020-06-09 16:12:31 UTC - Alexandre DUVAL: Can I see the topics that produced into these ensembleSize=1 ledgers?
----
2020-06-09 16:19:45 UTC - Alexandre DUVAL: + I reviewed all our namespaces and every persistence policy has ensembleSize=2 :confused:
----
2020-06-09 16:20:05 UTC - Sijie Guo: You can use `bin/bookkeeper shell recover --query` to query the ledgers that contain given bookies
----
2020-06-09 16:28:20 UTC - Alexandre DUVAL: how? `/pulsar/bin/bookkeeper shell recover --query clevercloud-bookkeeper-c1-n1:3148`  returns
----
2020-06-09 16:28:38 UTC - Alexandre DUVAL: ```16:27:56.878 [main] INFO  org.apache.bookkeeper.tools.cli.commands.bookies.RecoverCommand - Construct admin : org.apache.bookkeeper.client.BookKeeperAdmin@45be7cd5
16:27:59.318 [ZkLedgerManagerScheduler-7-1] INFO  org.apache.bookkeeper.client.BookKeeperAdmin - GetLedgersContainBookies completed with rc : 0
NOTE: Bookies in inspection list are marked with '*'.
Done
16:27:59.442 [main] INFO  org.apache.zookeeper.ZooKeeper - Session: 0x209072cbb8f0248 closed
16:27:59.442 [main-EventThread] INFO  org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x209072cbb8f0248```
----
2020-06-09 16:31:23 UTC - Alexandre DUVAL: got ~1000 ledgers in this situation
----
2020-06-09 16:31:27 UTC - Alexandre DUVAL: ```clevercloud-bookkeeper-c1-n1 ~ # /pulsar/bin/bookkeeper shell listledgers -meta | grep "ensembleSize=1" | wc
JMX enabled by default
   1014   10482  227769```

----
2020-06-09 16:31:30 UTC - Alexandre DUVAL: for all cluster
----
2020-06-09 16:31:47 UTC - Alexandre DUVAL: and logically ~360 per node (2 nodes)
----
2020-06-09 16:33:30 UTC - Alexandre DUVAL: Can I get the topics produced on them?
----
2020-06-09 16:33:42 UTC - jujugrrr: Hi, I'm testing Pulsar with pulsar-perf and a reader. I can see that after a while ledgers are getting full and new ones are created. If I restart my reader and try to point to the first message, I can only read from the first message in the current ledger. Is this expected behavior? Is there a way to read the first message in the topic even if ledgers rolled over many times?
----
2020-06-09 16:35:02 UTC - Matteo Merli: Yes, with readers you need to configure data retention on the topic. <https://pulsar.apache.org/docs/en/2.5.2/cookbooks-retention-expiry/#docsNav>
----
2020-06-09 16:35:06 UTC - Alexandre DUVAL: or timestamps ?
----
2020-06-09 16:39:26 UTC - Alexandre DUVAL: I have inconsistencies like
```Jun 09 16:27:17 clevercloud-bookkeeper-c1-n1 pulsar-bookie[7855]: 16:27:17.588 [BookieReadThreadPool-OrderedExecutor-6-0] INFO  org.apache.bookkeeper.proto.ReadEntryProcessorV3 - No ledger found while reading entry: 0 from ledger: 774
Jun 09 16:27:17 clevercloud-bookkeeper-c1-n1 pulsar-bookie[7855]: 16:27:17.591 [BookieReadThreadPool-OrderedExecutor-6-0] INFO  org.apache.bookkeeper.proto.ReadEntryProcessorV3 - No ledger found while reading entry: 0 from ledger: 774
Jun 09 16:27:17 clevercloud-bookkeeper-c1-n1 pulsar-bookie[7855]: 16:27:17.619 [BookieReadThreadPool-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.proto.ReadEntryProcessorV3 - No ledger found while reading entry: 0 from ledger: 31208
Jun 09 16:27:17 clevercloud-bookkeeper-c1-n1 pulsar-bookie[7855]: 16:27:17.622 [BookieReadThreadPool-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.proto.ReadEntryProcessorV3 - No ledger found while reading entry: 0 from ledger: 31208
Jun 09 16:27:17 clevercloud-bookkeeper-c1-n1 pulsar-bookie[7855]: 16:27:17.655 [BookieReadThreadPool-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.proto.ReadEntryProcessorV3 - No ledger found while reading entry: 0 from ledger: 64800
Jun 09 16:27:17 clevercloud-bookkeeper-c1-n1 pulsar-bookie[7855]: 16:27:17.662 [BookieReadThreadPool-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.proto.ReadEntryProcessorV3 - No ledger found while reading entry: 0 from ledger: 64800```
----
2020-06-09 16:39:43 UTC - Alexandre DUVAL: and when i try to read them
----
2020-06-09 16:39:45 UTC - Alexandre DUVAL: ```16:39:01.876 [BookKeeperClientWorker-OrderedExecutor-1-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L239 E0-E0, Sent to [clevercloud-bookkeeper-c1-n1:3181], Heard from [] : bitset = {}, Error = 'No such ledger exists on Bookies'. First unread entry is (-1, rc = null)```
----
2020-06-09 16:39:52 UTC - Alexandre DUVAL: like for ledger 774
----
2020-06-09 16:40:15 UTC - Alexandre DUVAL: I think I have differences between the real ledgers and the metadata present
----
2020-06-09 16:40:52 UTC - Alexandre DUVAL: Is there a way to rebuild the metadata from the ledgers available, or something like that?
----
2020-06-09 16:41:23 UTC - Alexandre DUVAL: (many questions, sorry, but I think it's all one problem)
----
2020-06-09 16:43:54 UTC - Alexandre DUVAL: ```ledgerID: 239
LedgerMetadata{formatVersion=2, ensembleSize=1, writeQuorumSize=1, ackQuorumSize=1, state=OPEN, digestType=CRC32, password=base64:, ensembles={0=[clevercloud-bookkeeper-c1-n1:3181]}, customMetadata={}}```
vs
```16:43:44.716 [BookKeeperClientWorker-OrderedExecutor-1-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L239 E0-E0, Sent to [clevercloud-bookkeeper-c1-n1:3181], Heard from [] : bitset = {}, Error = 'No such ledger exists on Bookies'. First unread entry is (-1, rc = null)```

----
2020-06-09 16:44:46 UTC - Alexandre DUVAL: @Sijie Guo (only the last message covers all i discussed here ^)
----
2020-06-09 16:46:33 UTC - Alexandre DUVAL: ~1000 ledgers in this situation
----
2020-06-09 17:05:43 UTC - Alexandre DUVAL: any suggestion?
----
2020-06-09 17:15:26 UTC - Alexandre DUVAL: Does metaformat offer options?
----
2020-06-09 17:25:23 UTC - Sijie Guo: pick one ledger and run `bin/bookkeeper shell ledger -m ledger_id`
----
2020-06-09 17:25:29 UTC - Sijie Guo: Can you see the returned result?
----
2020-06-09 17:26:36 UTC - Alexandre DUVAL: ```clevercloud-bookkeeper-c1-n1 ~ # /pulsar/bin/bookkeeper shell ledger -m 239
JMX enabled by default
ERROR: initializing dbLedgerStorage Entry -1 not found in 239```
----
2020-06-09 17:27:13 UTC - Sijie Guo: Yes. It will be in 2.7. We want to deliver 2.6 first with some of the long outstanding features.
----
2020-06-09 17:27:14 UTC - Alexandre DUVAL: (metadata)
```ledgerID: 239
LedgerMetadata{formatVersion=2, ensembleSize=1, writeQuorumSize=1, ackQuorumSize=1, state=OPEN, digestType=CRC32, password=base64:, ensembles={0=[clevercloud-bookkeeper-c1-n1:3181]}, customMetadata={}}```
----
2020-06-09 17:28:21 UTC - Alexandre DUVAL: @Sijie Guo ^
----
2020-06-09 17:29:49 UTC - Sijie Guo: Transactions are not supported yet, although if you are looking for atomically producing messages into one single partition, you can use batching and de-duplication to achieve that.

See <https://github.com/streamnative/pulsar-examples/blob/master/clients/pubsub/src/main/java/io/streamnative/examples/pubsub/SinglePartitionAtomicProducerExample.java#L51>

Example: <https://github.com/streamnative/pulsar-examples/blob/master/clients/pubsub/examples/single-partition-atomic-writes.md>
----
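The de-duplication half of this can be switched on per namespace with the admin CLI; a minimal sketch, assuming a namespace named `public/default` (illustrative; use your own tenant/namespace):

```shell
# Enable message de-duplication for a namespace. Brokers must also have
# deduplication support enabled in broker.conf (brokerDeduplicationEnabled=true).
pulsar-admin namespaces set-deduplication public/default --enable
```

The producer side then sets a stable producer name and sequence IDs, as in the linked example.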
2020-06-09 17:30:43 UTC - Sijie Guo: Okay, `customMetadata` is empty, so it seems that the ledger was created by the bookie sanity check
----
2020-06-09 17:31:25 UTC - Sijie Guo: Then you can use `bin/bookkeeper shell deleteledger` to delete those ledgers whose ensemble size is 1.
----
2020-06-09 17:32:15 UTC - Alexandre DUVAL: I have like ~1000 ledgers to delete; any batch solution? :confused:
----
2020-06-09 17:32:39 UTC - Alexandre DUVAL: What triggered this creation? What is bookiesanity?
----
2020-06-09 17:32:55 UTC - Sijie Guo: You can write a bash script to do that, can't you?
+1 : Alexandre DUVAL
----
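A sketch of such a script, assuming the `listledgers -meta` output format shown above (the `deleteledger` flag names should be confirmed with `bookkeeper shell deleteledger -h` before running; the delete loop is left commented out because it is destructive):

```shell
# Print IDs of ledgers whose metadata reports ensembleSize=1.
# Assumes the "ledgerID: N" / "LedgerMetadata{...}" format shown above.
extract_single_ensemble_ledgers() {
  awk '/^ledgerID:/ { id = $2 } /ensembleSize=1,/ { print id }'
}

# Demonstration on the metadata format quoted earlier in the thread:
printf 'ledgerID: 239\nLedgerMetadata{formatVersion=2, ensembleSize=1, writeQuorumSize=1, ackQuorumSize=1, state=OPEN}\nledgerID: 240\nLedgerMetadata{formatVersion=2, ensembleSize=2, writeQuorumSize=2, ackQuorumSize=2, state=CLOSED}\n' \
  | extract_single_ensemble_ledgers

# Destructive part, commented out on purpose -- verify the ledger list first:
# /pulsar/bin/bookkeeper shell listledgers -meta \
#   | extract_single_ensemble_ledgers \
#   | while read -r id; do
#       /pulsar/bin/bookkeeper shell deleteledger -ledgerid "$id" -force
#     done
```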
2020-06-09 17:33:05 UTC - Sijie Guo: I think it is bookiesanity
----
2020-06-09 17:33:39 UTC - Alexandre DUVAL: and why/how?
----
2020-06-09 17:34:12 UTC - Sijie Guo: bookiesanity creates ensemble-size=1 ledger
----
2020-06-09 17:34:30 UTC - Sijie Guo: did you run bookiesanity command?
----
2020-06-09 17:35:29 UTC - Alexandre DUVAL: I don't think so (I'm not the only one working on this)
----
2020-06-09 17:35:50 UTC - Sijie Guo: simpletest?
----
2020-06-09 17:36:02 UTC - Sijie Guo: did you run any other tools?
----
2020-06-09 17:36:19 UTC - Alexandre DUVAL: metaformat a long time ago
----
2020-06-09 17:36:26 UTC - Sijie Guo: This usually happens when people using bookiesanity or other tools for health check
----
2020-06-09 17:36:35 UTC - Alexandre DUVAL: &gt; If your goal is to completely wipe the "old" data, then I suggest you run the pulsar cluster initialization (which you have done), the BK metadata format `bin/bookkeeper shell metaformat`, AND format the local filesystem data on a bookie using the bookieformat command on each bookie: `bin/bookkeeper shell bookieformat`
----
2020-06-09 17:36:42 UTC - Alexandre DUVAL: I applied this, like 10 months ago
----
2020-06-09 17:37:29 UTC - Alexandre DUVAL: So decommission should ignore these ledgers' metadata?
----
2020-06-09 17:37:36 UTC - Alexandre DUVAL: (in the future, i mean)
----
2020-06-09 17:38:11 UTC - Alexandre DUVAL: And bookiesanity should mark the metadata it created?
----
2020-06-09 17:40:35 UTC - Sijie Guo: &gt; decomission should ignore these ledgersmetadata?
yes
----
2020-06-09 17:40:36 UTC - Atri Sharma: @Atri Sharma has joined the channel
----
2020-06-09 17:40:44 UTC - Sijie Guo: &gt; bookiesanity should mark metadata as it created metadata?
yes
----
2020-06-09 19:30:02 UTC - jujugrrr: thanks @Matteo Merli, I already set up retention/TTL/backlog quota:
```{
  "retentionTimeInMinutes" : 99999999,
  "retentionSizeInMB" : 99999999999999
}
pulsar-admin namespaces get-message-ttl ten/ns
0
$ pulsar-admin namespaces get-backlog-quotas ten/ns
{
  "destination_storage" : {
    "limit" : -1073741824,
    "policy" : "producer_request_hold"
  }
}```
----
2020-06-09 19:31:33 UTC - jujugrrr: I'm not consuming the messages, just reading, so only the backlog matters I guess
----
2020-06-09 19:33:03 UTC - Matteo Merli: Ok. You can also use -1 to have infinite retention.

When you create a reader you have to specify `MessageId.earliest` as the starting point.
----
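For reference, infinite retention can be set at the namespace level with the admin CLI; a sketch using the namespace from this thread:

```shell
# -1 for both time and size means retain acknowledged messages forever.
pulsar-admin namespaces set-retention ten/ns --time -1 --size -1

# Verify the policy took effect:
pulsar-admin namespaces get-retention ten/ns
```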
2020-06-09 19:33:39 UTC - jujugrrr: yep, but when I do this it's not pointing to the first ledger's first entry; it's pointing at the first entry in the currently open ledger
----
2020-06-09 19:34:59 UTC - jujugrrr: I can see it because the message ID is always (open ledger:first entry)
```id='(72,0,-1,0)'```
----
2020-06-09 19:38:28 UTC - jujugrrr: looking at zookeeper, when it creates the new ledger the old one is actually removed
----
2020-06-09 19:45:11 UTC - jujugrrr: ```19:37:58.824 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO  org.apache.bookkeeper.mledger.impl.OpAddEntry - [ten/ns/persistent/my-topic] Closing ledger 73 for being full
19:37:58.832 [pulsar-ordered-OrderedExecutor-1-0-EventThread] WARN  org.apache.bookkeeper.client.BookieWatcherImpl - New ensemble: [xxx-pulsar-bookie-2.xxx-pulsar-bookie.xxx-pulsar.svc.cluster.local:3181, xxx-pulsar-bookie-1.xxx-pulsar-bookie.xxx-pulsar.svc.cluster.local:3181] is not adhering to Placement Policy. quarantinedBookies: []
19:37:58.841 [pulsar-ordered-OrderedExecutor-1-0-EventThread] INFO  org.apache.bookkeeper.client.LedgerCreateOp - Ensemble: [xxx-pulsar-bookie-2.xxx-pulsar-bookie.xxx-pulsar.svc.cluster.local:3181, xxx-pulsar-bookie-1.xxx-pulsar-bookie.xxx-pulsar.svc.cluster.local:3181] for ledger: 74
19:37:58.841 [pulsar-ordered-OrderedExecutor-1-0-EventThread] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [ten/ns/persistent/my-topic] Created new ledger 74
19:37:58.937 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [ten/ns/persistent/my-topic] End TrimConsumedLedgers. ledgers=1 totalSize=1166573
19:37:58.937 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [ten/ns/persistent/my-topic] Removing ledger 72 - size: 14333058
19:37:58.938 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [ten/ns/persistent/my-topic] Removing ledger 73 - size: 623361629```
----
2020-06-09 19:45:33 UTC - jujugrrr: Do I need to configure something to not remove the full ledger?
----
2020-06-09 19:51:18 UTC - jujugrrr: changing the retention to -1 seems to work :disappointed:
----
2020-06-09 20:49:12 UTC - Marcio Martins: Hi, I am trying to test tiered storage, and after fixing the permissions problem with S3, now I encountered a new issue, this time related to Pulsar:
```Completing metadata for offload of ledger 0 with uuid 985af5bc-48de-43c9-907f-5919b9a40f63
Failed to complete offload of ledger 0, uuid 985af5bc-48de-43c9-907f-5919b9a40f63
Ledger 0 no longer exists in ManagedLedger, likely trimmed```
I have very long retention setup:
```defaultRetentionTimeInMinutes=131040
defaultRetentionSizeInMB=0```
Any ideas what I might be doing wrong? I am producing and consuming with `pulsar-perf`
----
2020-06-09 22:19:40 UTC - Marcio Martins: Setting `defaultRetentionSizeInMB=-1` seems to stop the trimming, but now I get another error:
```Preparing metadata to offload ledger 31 with uuid e62b6d2b-7949-409f-96bd-7227aea9e8ed
Failed to prepare ledger 31 for offload, uuid e62b6d2b-7949-409f-96bd-7227aea9e8ed
org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion```
----
2020-06-09 22:20:34 UTC - Marcio Martins: Ideas?
----
2020-06-10 02:04:33 UTC - Huanli Meng: @Addison Higham, Thanks
----
2020-06-10 02:20:19 UTC - tuteng: Unload disconnects all connections on the current topic
----
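So the sequence that worked earlier in the thread amounts to disconnecting clients first, then force-deleting; roughly (topic name taken from the earlier error message):

```shell
# Unload closes all producer/consumer connections to the topic...
pulsar-admin topics unload persistent://ten/ns/my-topic

# ...after which the force delete no longer hits TopicBusyException
# ("Topic has 1 connected producers/consumers").
pulsar-admin topics delete --force persistent://ten/ns/my-topic
```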
2020-06-10 07:54:31 UTC - Vil: Not sure if this the right place to ask, but will Pulsar Summit talks be recorded and shared afterwards?
----
2020-06-10 08:40:41 UTC - Asaf Mesika: Regarding this example: if the application containing the producer crashes, does it need to save the last sequence ID used in a database to make sure it monotonically increases?
----