Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2020/06/12 09:11:05 UTC

Slack digest for #general - 2020-06-12

2020-06-11 09:12:03 UTC - jujugrrr: OK, maybe it's worth raising an issue on GitHub? @Addison Higham is it something you had to cover for your talk on Pulsar + K8S? Were you using EKS?
----
2020-06-11 10:40:14 UTC - jujugrrr: Hi all. I'm testing pulsar offloading to S3. I have a script producing 1M messages and another one reading them. The reading works well a few times (I re-run from scratch) but then I start to get exceptions:
```10:28:56.701 [pulsar-io-24-1] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [ten/ns/persistent/my-topic-reader-3558c16521] Rewind from 233:0 to 233:0
10:28:56.701 [pulsar-io-24-1] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://ten/ns/my-topic] There are no replicated subscriptions on the topic
10:28:56.701 [pulsar-io-24-1] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://ten/ns/my-topic][reader-3558c16521] Created new subscription for 0
10:28:56.701 [pulsar-io-24-1] INFO  org.apache.pulsar.broker.service.ServerCnx - [/x.x.x.x7:53908] Created subscription on topic persistent://ten/ns/my-topic / reader-3558c16521
10:28:56.705 [bookkeeper-ml-workers-OrderedExecutor-6-0] WARN  org.apache.bookkeeper.mledger.impl.OpReadEntry - [ten/ns/persistent/my-topic][reader-3558c16521] read failed from ledger at position:233:0 : Unknown exception
10:28:56.705 [broker-topic-workers-OrderedScheduler-3-0] ERROR org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer - [persistent://ten/ns/my-topic / reader-3558c16521-Consumer{subscription=PersistentSubscription{topic=persistent://ten/ns/my-topic, name=reader-3558c16521}, consumerId=0, consumerName=, address=/x.x.x.x7:53908}] Error reading entries at 233:0 : Unknown exception - Retrying to read in 15.0 seconds```
Those ledgers are getting offloaded to S3. It looks like as soon as a ledger is removed from local storage (after the set-offload-deletion-lag period), I start getting the exception above.
```10:28:28.452 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [ten/ns/persistent/my-topic] End TrimConsumedLedgers. ledgers=3 totalSize=38942923
10:28:28.452 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [ten/ns/persistent/my-topic] Deleting offloaded ledger 233 from bookkeeper - size: 15432415
10:28:28.452 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [ten/ns/persistent/my-topic] Deleting offloaded ledger 234 from bookkeeper - size: 15504438
- size: 16168356```
Also, I can see the ledgers got removed from ZooKeeper. Is there a configuration option I'm missing? Is there a way to understand why `read failed from ledger at position:233:0 : Unknown exception` is happening? Thank you!

I would expect to still be able to read the messages from the offloaded ledgers
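
One way to check the suspected link between ledger deletion and the read failures is to scan the broker log for ledgers that were reported deleted and then failed to read. Below is a minimal Java sketch, assuming the log lines are fed in chronological order and match the two message formats quoted above. `OffloadLogCheck` and `failedAfterDelete` are hypothetical names for illustration, not part of Pulsar.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OffloadLogCheck {
    // Patterns taken from the broker log lines quoted in this message.
    private static final Pattern DELETED =
            Pattern.compile("Deleting offloaded ledger (\\d+) from bookkeeper");
    private static final Pattern READ_FAILED =
            Pattern.compile("read failed from ledger at position:(\\d+):(\\d+)");

    /** Returns ledger IDs that failed to read after the same ledger was reported deleted. */
    static Set<Long> failedAfterDelete(List<String> logLines) {
        Set<Long> deleted = new LinkedHashSet<>();
        Set<Long> suspects = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher d = DELETED.matcher(line);
            if (d.find()) {
                deleted.add(Long.parseLong(d.group(1)));
                continue;
            }
            Matcher r = READ_FAILED.matcher(line);
            if (r.find()) {
                long ledgerId = Long.parseLong(r.group(1));
                // Only flag failures on ledgers already deleted from BookKeeper.
                if (deleted.contains(ledgerId)) {
                    suspects.add(ledgerId);
                }
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "... Deleting offloaded ledger 233 from bookkeeper - size: 15432415",
                "... Deleting offloaded ledger 234 from bookkeeper - size: 15504438",
                "... read failed from ledger at position:233:0 : Unknown exception");
        System.out.println(failedAfterDelete(lines)); // prints [233]
    }
}
```

If every failing position belongs to a ledger that was deleted shortly before, that supports the idea that reads are not being served from the offloaded copy.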
----
2020-06-11 14:41:11 UTC - Arnaud Briche: Hi all,

I'm trying to deploy Pulsar standalone on minikube, and it seems like I can only produce/consume messages from the same container as Pulsar.
Using the Kubernetes service DNS name does not work.
----
2020-06-11 15:22:33 UTC - Addison Higham: @Marcio Martins @jujugrrr S3 offloading is NOT built on top of the AWS SDK, it uses jclouds. OIDC requires a new credential provider. jclouds will need to get support for OIDC. We started in on Pulsar before OIDC was a thing, so we implemented per-pod IAM with kiam. We also support OIDC but can't fully migrate to it until everything supports OIDC auth
----
2020-06-11 15:24:54 UTC - jujugrrr: makes sense
----
2020-06-11 16:00:07 UTC - Addison Higham: oh actually, I just looked into this more @jujugrrr all that needs to happen is we need to bump the aws-java-sdk version as jclouds does use the credential provider chain from aws-sdk
----
2020-06-11 16:00:24 UTC - jujugrrr: ah, sounds good
----
2020-06-11 16:01:29 UTC - jujugrrr: definitely a great feature for all the EKS based deployment
----
2020-06-11 16:01:55 UTC - Marcio Martins: Yep, that's what I thought from that github issue. Will it make it for 2.6.0?
----
2020-06-11 16:03:11 UTC - Matthew Follegot: Hi folks, I have a question regarding using Pulsar as a message queue.

What happens if I have a small, finite number of consumers and a large ingestion rate into a Pulsar topic? I understand that Pulsar persists unacknowledged messages to BookKeeper, but what I don't understand is if these events will automatically be fetched from BookKeeper at a later time or if they will sit there indefinitely until they are manually consumed.

Any help is appreciated! Thank you :)
----
2020-06-11 16:03:17 UTC - Addison Higham: the sdk version was bumped already for an unrelated feature
----
2020-06-11 16:03:28 UTC - Addison Higham: and is in 2.6.0
----
2020-06-11 16:04:54 UTC - Gary Fredericks: by "fetched from" do you mean "deleted from"?
----
2020-06-11 16:06:06 UTC - jujugrrr: I think this is what you are looking for: <http://pulsar.apache.org/docs/en/cookbooks-retention-expiry/#backlog-quotas>
----
2020-06-11 16:07:17 UTC - Matthew Follegot: I mean consumed and deleted from
----
2020-06-11 16:07:39 UTC - Matthew Follegot: Thank you, I'll look into this!
----
2020-06-11 16:16:00 UTC - Marcio Martins: Top! Thx
----
2020-06-11 17:04:14 UTC - Nicolas Ha: I can now see the page with a v3 selector :slightly_smiling_face:  thanks
----
2020-06-11 20:55:34 UTC - Marcio Martins: Hey guys, I have pumped 26M messages to pulsar with retention on, and offloaded almost everything to S3. Now I have 3 readers consuming, but every few seconds I get this exception:
```20:49:14.325 [bookkeeper-ml-workers-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.common.util.SafeRunnable - Unexpected throwable caught 
java.lang.NullPointerException: null
	at org.apache.bookkeeper.mledger.impl.OpReadEntry.lambda$readEntriesFailed$0(OpReadEntry.java:88) ~[org.apache.pulsar-managed-ledger-2.5.1.jar:2.5.1]
	at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) ~[org.apache.pulsar-managed-ledger-2.5.1.jar:2.5.1]
	at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.45.Final.jar:4.1.45.Final]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]```
Does anybody know if my readers are still getting every message correctly, or actually, what is the impact of this exception?
Secondly, the rate at which it's consuming is quite slow at ~150Mbit/s. At one point the first reader which I started earlier than the other 2, was starved for 1 minute with no messages at all. I assume this is due to fetching the offloaded ledgers from S3, but even so, 1 minute with no messages seems like something else is going on...
----
2020-06-11 21:43:23 UTC - Jeff Schneller: I created a tenant called `my-tenant` and a namespace called `dev` using the pulsar-admin cli.  When viewing in the pulsar-admin cli I can see the tenant and namespace.  I can also do `pulsar-admin namespaces policies my-tenant/dev` and I get a result back.   Now when I use a java client to create a producer to a topic that doesn't exist (I expect the topic to be created) using this code:
`client.newProducer(org.apache.pulsar.client.api.Schema.JSON(MyObject.class)).topic("persistent://my-tenant/dev/my-topic").create();`
I am getting an error of:
`org.apache.pulsar.client.api.PulsarClientException$BrokerMetadataException: Policies not found for my-tenant/dev namespace`

Any ideas on what is going on?  What did I miss in the namespace creation?
----
2020-06-11 21:44:53 UTC - Frank Kelly: Did you turn on `authorizationEnabled=true` ?
----
2020-06-11 21:45:47 UTC - Jeff Schneller: yes it is turned on.  Do I need to set role access?

I thought if I didn't have any role access then all were allowed.
----
2020-06-11 21:47:13 UTC - Frank Kelly: No, I _*think*_ if you turn on the default Authorization Plugin then basically you have to authorize access to each tenant/namespace/topic - sorry I don't know more - still learning about Auth*n myself. Possibly @Sijie Guo can be definitive?
----
2020-06-11 21:49:23 UTC - Jeff Schneller: Ok I can try that.  I am still learning myself.  I was hoping to set the authorize access at the topic level only so I don't give all roles produce/consume at the namespace and then forget for a certain topic.  I can certainly grant produce consume for all roles at the namespace and then produce or consume at the topic level
----
2020-06-11 21:50:44 UTC - Sijie Guo: What did you get from `pulsar-admin namespaces policies my-tenant/dev`?
----
2020-06-11 21:52:27 UTC - Jeff Schneller: `{`
  `"auth_policies" : {`
    `"namespace_auth" : { },`
    `"destination_auth" : { },`
    `"subscription_auth_roles" : { }`
  `},`
  `"replication_clusters" : [ "pulsar-cluster-1" ],`
  `"bundles" : {`
    `"boundaries" : [ "0x00000000", "0x40000000", "0x80000000", "0xc0000000", "0xffffffff" ],`
    `"numBundles" : 4`
  `},`
  `"backlog_quota_map" : {`
    `"destination_storage" : {`
      `"limit" : -1073741824,`
      `"policy" : "producer_request_hold"`
    `}`
  `},`
  `"clusterDispatchRate" : { },`
  `"topicDispatchRate" : {`
    `"pulsar-cluster-1" : {`
      `"dispatchThrottlingRateInMsg" : 0,`
      `"dispatchThrottlingRateInByte" : 0,`
      `"relativeToPublishRate" : false,`
      `"ratePeriodInSecond" : 1`
    `}`
  `},`
  `"subscriptionDispatchRate" : {`
    `"pulsar-cluster-1" : {`
      `"dispatchThrottlingRateInMsg" : 0,`
      `"dispatchThrottlingRateInByte" : 0,`
      `"relativeToPublishRate" : false,`
      `"ratePeriodInSecond" : 1`
    `}`
  `},`
  `"replicatorDispatchRate" : { },`
  `"clusterSubscribeRate" : {`
    `"pulsar-cluster-1" : {`
      `"subscribeThrottlingRatePerConsumer" : 0,`
      `"ratePeriodInSecond" : 30`
    `}`
  `},`
  `"publishMaxMessageRate" : { },`
  `"latency_stats_sample_rate" : { },`
  `"message_ttl_in_seconds" : 0,`
  `"deleted" : false,`
  `"encryption_required" : false,`
  `"subscription_auth_mode" : "None",`
  `"max_producers_per_topic" : 0,`
  `"max_consumers_per_topic" : 0,`
  `"max_consumers_per_subscription" : 0,`
  `"compaction_threshold" : 0,`
  `"offload_threshold" : -1,`
  `"schema_auto_update_compatibility_strategy" : "Full",`
  `"schema_compatibility_strategy" : "UNDEFINED",`
  `"is_allow_auto_update_schema" : true,`
  `"schema_validation_enforced" : false`
`}`
----
2020-06-11 21:53:15 UTC - Jeff Schneller: I just tried to set the permissions and received "Authorization is not enabled"  so I guess I don't have authorization turned on.  It should have been.
----
2020-06-11 22:05:19 UTC - Jeff Schneller: Before I turn on authorization, I would like to understand why I can't create a subscriber without authorization.
----
2020-06-11 23:01:05 UTC - Sijie Guo: Is this standalone or a cluster?
----
2020-06-11 23:03:43 UTC - Jeff Schneller: Cluster but only one broker is turned on right now.  
----
2020-06-12 01:17:28 UTC - Jeff Schneller: I am an idiot.  I had a typo in the tenant name.  I really need to get some glasses.
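
A typo like this can be caught before the producer is created by comparing the tenant and namespace embedded in the topic name with the ones that were set up. Below is a minimal Java sketch, assuming the `persistent://tenant/namespace/topic` form used earlier in this thread. `TopicNameCheck` and its methods are hypothetical helpers, not Pulsar client API.

```java
final class TopicNameCheck {
    /** Splits "domain://tenant/namespace/topic" into {domain, tenant, namespace, topic}. */
    static String[] parse(String topicName) {
        int sep = topicName.indexOf("://");
        if (sep <= 0) {
            throw new IllegalArgumentException("Missing domain prefix: " + topicName);
        }
        // Limit of 3 keeps any further '/' characters inside the topic part.
        String[] rest = topicName.substring(sep + 3).split("/", 3);
        if (rest.length != 3 || rest[0].isEmpty() || rest[1].isEmpty() || rest[2].isEmpty()) {
            throw new IllegalArgumentException("Expected tenant/namespace/topic: " + topicName);
        }
        return new String[] {topicName.substring(0, sep), rest[0], rest[1], rest[2]};
    }

    /** True when the topic name points at the given tenant and namespace. */
    static boolean isInNamespace(String topicName, String tenant, String namespace) {
        String[] parts = parse(topicName);
        return parts[1].equals(tenant) && parts[2].equals(namespace);
    }

    public static void main(String[] args) {
        // Correct tenant: prints true
        System.out.println(isInNamespace("persistent://my-tenant/dev/my-topic", "my-tenant", "dev"));
        // Misspelled tenant (made up for illustration): prints false
        System.out.println(isInNamespace("persistent://my-tennant/dev/my-topic", "my-tenant", "dev"));
    }
}
```

A check like this only compares strings on the client side; it does not confirm that the namespace actually exists on the cluster.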
----
2020-06-12 01:18:42 UTC - Luke Stephenson: Is there an image built from master which has the fix?
----
2020-06-12 01:20:19 UTC - Luke Stephenson: Does anyone have S3 offloading working with a region other than us-east-1? Would love to know how you got it working
----
2020-06-12 01:21:28 UTC - Luke Stephenson: I'm blocked on this issue <https://github.com/apache/pulsar/issues/3833>
----
2020-06-12 01:28:17 UTC - Penghui Li: @Luke Stephenson 2.6.0 rc is out and will be released soon
----
2020-06-12 06:04:46 UTC - Sijie Guo: @Luke Stephenson - What is your configuration?
----
2020-06-12 06:05:28 UTC - Sijie Guo: @xiaolong.ran - did verify S3 offloading using 2.5.1 before.
----
2020-06-12 07:57:32 UTC - Daniel Ciocirlan: Hey, a question: if we have 4 geo-replicated standalone Pulsar clusters in different regions, where one region is producing and the other regions are consuming, do I have to create the tenant/namespace/partitioned topic in every region, or just in the producing region? Is the tenant/namespace/partitioned-topic information replicated to the other regions?
----
2020-06-12 08:39:13 UTC - Dhakshin: Hi

I am running standalone Pulsar (on my laptop as a single node).
I want to see the Pulsar Functions metrics below in Prometheus. Which property, in which file, needs to be modified?

*pulsar_function_last_invocation*
*pulsar_function_system_exceptions_total*
*pulsar_function_user_exceptions_total*
*pulsar_function_process_latency_ms*
*pulsar_function_received_total*
*pulsar_function_processed_successfully_total*

I referred to the link below (the Pulsar Functions section):
<http://pulsar.apache.org/docs/en/next/reference-metrics/>
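
Once metrics are being exposed, the function metrics listed above can be picked out of the Prometheus text output by their common name prefix. Below is a minimal Java sketch, assuming the metrics text has already been fetched (where it is served from depends on the setup and is not covered here). `FunctionMetricsFilter` is a hypothetical helper, and the sample text in `main` is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class FunctionMetricsFilter {
    /** Keeps only sample lines whose metric name starts with "pulsar_function_". */
    static List<String> functionSamples(String metricsText) {
        List<String> samples = new ArrayList<>();
        for (String line : metricsText.split("\\r?\\n")) {
            String trimmed = line.trim();
            // "# HELP" and "# TYPE" comment lines start with '#', so they are skipped here.
            if (trimmed.startsWith("pulsar_function_")) {
                samples.add(trimmed);
            }
        }
        return samples;
    }

    public static void main(String[] args) {
        // Made-up sample in Prometheus text format, not real broker output.
        String sample = "# TYPE pulsar_function_received_total counter\n"
                + "pulsar_function_received_total 3\n"
                + "jvm_threads_current 5\n";
        for (String line : functionSamples(sample)) {
            System.out.println(line);
        }
    }
}
```

An empty result from a check like this would suggest the function metrics are not being exported at all, rather than being filtered out on the Prometheus side.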
----