Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2020/02/25 09:11:03 UTC

Slack digest for #general - 2020-02-25

2020-02-24 11:03:45 UTC - Konstantinos Papalias: Thanks for sharing your experiences @Devin G. Bost, looking forward to the sample project and the Jira to track discussions like this.
----
2020-02-24 11:38:52 UTC - Lewey: I have an issue where I have 6 consumers connecting to a partitioned topic with 18 partitions; however, the consumers only seem to be consuming from odd-numbered partitions.
----
2020-02-24 11:42:55 UTC - Vladimir Shchur: Hi! A question about configuration: should the broker.conf configuration be synchronized between broker and bookkeeper? I want to increase maxMessageSize, but increasing it just for the broker (in the Helm k8s chart) doesn't change broker.conf for the bookkeeper pod, and I have to use nettyMaxFrameSizeBytes to make it work. This is on 2.5.0.
----
2020-02-24 11:47:06 UTC - Roman Popenov: And just to add to that, I have set `maxMessageSize` only, and it did seem to be picked up by the system, but I also saw some exceptions like:
```13:30:42.559 [bookie-io-1-1] ERROR org.apache.bookkeeper.proto.BookieRequestHandler - Unhandled exception occurred in I/O thread or handler
io.netty.handler.codec.TooLongFrameException: Adjusted frame length exceeds 5242880: 5313378 - discarded
at io.netty.handler.codec.LengthFieldBasedFrameDecoder.fail(LengthFieldBasedFrameDecoder.java:513) ~[io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]```
----
2020-02-24 11:50:00 UTC - Roman Popenov: And then, without setting `nettyMaxFrameSizeBytes` anywhere in the config, I was able to exchange messages of roughly 40 MB in size. Is it possible that my `maxMessageSize` is picked up, but there is some overhead that is passed along with the message, and when the limit is hit the error message is generic?
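
For reference, the two numbers in the exception above can be compared directly. This is only a minimal sketch of the arithmetic in Python; both constants are copied from the log line, and the log alone does not show whether the excess is payload or protocol overhead.

```python
# Both values are copied from the TooLongFrameException message above.
FRAME_LIMIT_BYTES = 5_242_880   # "Adjusted frame length exceeds 5242880"
FRAME_SEEN_BYTES = 5_313_378    # the frame length that was discarded

# Express the limit in MiB and work out how far the frame overshot it.
limit_mib = FRAME_LIMIT_BYTES / (1024 * 1024)
excess_bytes = FRAME_SEEN_BYTES - FRAME_LIMIT_BYTES
excess_percent = 100 * excess_bytes / FRAME_LIMIT_BYTES

print(f"limit: {limit_mib:.0f} MiB")
print(f"frame exceeds the limit by {excess_bytes} bytes ({excess_percent:.2f}%)")
```

The limit is exactly 5 MiB, and the rejected frame is only slightly larger than it.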
----
2020-02-24 11:50:11 UTC - Roman Popenov: <https://github.com/apache/pulsar/issues/3832>
----
2020-02-24 12:04:08 UTC - Rattanjot Singh: Is there any wiki for deploying Pulsar multi-cluster using Kubernetes on AWS?
----
2020-02-24 12:06:07 UTC - Roman Popenov: I think helm install should work
----
2020-02-24 12:06:26 UTC - Roman Popenov: If you have already set up your Kubernetes cluster
----
2020-02-24 12:07:22 UTC - Roman Popenov: Or do you mean across multiple AZs?
----
2020-02-24 12:07:39 UTC - Roman Popenov: If it’s not one single cluster, then no, there are no docs for that
----
2020-02-24 12:07:43 UTC - Rattanjot Singh: I want it in east and west
----
2020-02-24 12:08:32 UTC - Roman Popenov: I think you would have to piece things together from
<http://pulsar.apache.org/docs/en/deploy-bare-metal-multi-cluster/>
----
2020-02-24 12:09:22 UTC - Rattanjot Singh: So there is no end to end wiki for kubernetes
----
2020-02-24 12:10:25 UTC - Roman Popenov: Not that I am aware of. I think it’s possible to piece things together from that doc
----
2020-02-24 12:38:05 UTC - Pavel Tishkevich: Hi All,

I’ve noticed that when one of three brokers fails, unloading topics from the failed broker may take approximately 1 to 5 minutes, depending on the number of topics.
I’ve also noticed that during unloading, sending messages to topics served by the brokers that are still alive is very slow and approaches the timeout.

- Is there a way to decrease the duration of topic unloading? Maybe by adding more brokers? We have approximately 200000 short-lived topics on average.
- Why is sending messages to topics on the live brokers so slow during unloading? Is there a way to improve this?

Thanks in advance!
----
2020-02-24 14:13:04 UTC - Steve Kim: done <https://github.com/apache/pulsar/issues/6410>
----
2020-02-24 15:09:18 UTC - Manuel Mueller: thanks for all the answers! I think it would be nice if the feature were marked “dev-preview”, so as not to lose too much time with it in case one is aiming to have this in production. I will replicate it and file an issue with the proper logs
----
2020-02-24 15:32:25 UTC - Devin G. Bost: Function state currently is marked “developer preview” :slightly_smiling_face:
----
2020-02-24 15:49:13 UTC - Rolf Arne Corneliussen: @Sijie Guo Thanks for the information. I have done some tests (still on Windows 10), and it seems the timer wheel thread will use 100% of a hyperthread. When running 8 consumers on an i7-8700 (4 cores 2 threads per core), Task Manager reported 100% CPU usage when there was no traffic and no messages on the topics.

I have written a simple test program that creates a `HashedWheelTimer` with a 1 millisecond tick time, then adds a simple timer task that reschedules itself every 500 milliseconds when it times out, and the CPU load is the same as running an idle Pulsar consumer.

My understanding of the `HashedWheelTimer` is that it is intended to be used for a large number of approximate timeouts and not for millisecond-precision scheduling, but I may be mistaken.

Anyway, I tried using a `java.util.concurrent.ScheduledExecutorService`, scheduling 1000 tasks at a fixed interval of 1 millisecond, and that was lighter on the CPU than the timer wheel with a 1 millisecond tick time.

Should I raise a GitHub issue on this?
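
To put the tick time in perspective, here is a rough sketch of the arithmetic in Python. It assumes that the wheel's worker thread wakes up once per tick whether or not a task is due; the helper name `wheel_wakeups` is made up for illustration, and the 100 ms tick is only a comparison value, not something from the test above.

```python
def wheel_wakeups(tick_ms, task_period_ms):
    """Return (wakeups per second, task runs per second, idle ticks between runs).

    Assumption: the timer wheel's worker wakes once per tick, due task or not.
    """
    wakeups_per_second = 1000 // tick_ms
    task_runs_per_second = 1000 / task_period_ms
    idle_ticks_between_runs = task_period_ms // tick_ms - 1
    return wakeups_per_second, task_runs_per_second, idle_ticks_between_runs


# The setup described above: 1 ms tick, one task rescheduled every 500 ms.
print(wheel_wakeups(tick_ms=1, task_period_ms=500))
# A coarser tick, picked only for comparison.
print(wheel_wakeups(tick_ms=100, task_period_ms=500))
```

Under that assumption, almost all wakeups with a 1 millisecond tick find nothing to do, since the task itself only runs twice per second.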
----
2020-02-24 16:08:51 UTC - Sijie Guo: Yeah please create a GitHub issue.
----
2020-02-24 16:17:38 UTC - Joshua Dunham: Hey Everyone, is commercial support offered for Pulsar/BookKeeper? Can someone make a recommendation?
----
2020-02-24 16:18:01 UTC - Devin G. Bost: Please reach out to @Sijie Guo at StreamNative. He can help you out!
----
2020-02-24 16:18:24 UTC - Joshua Dunham: Nice, thx!
----
2020-02-24 16:30:51 UTC - Tanner Nilsson: @Chris Bartholomew and kafkaesque.io (<http://kafkaesque.io>) have been great for us!
----
2020-02-24 16:50:27 UTC - Sijie Guo: @Joshua Dunham yes, we offer different types of services (developer support, managed service support, etc.) for Pulsar and BookKeeper.
----
2020-02-24 16:53:28 UTC - Sijie Guo: The maximum message size has to be configured in two places: one is `maxMessageSize` in the broker, and the other is `nettyMaxFrameSizeBytes`.
----
2020-02-24 16:55:54 UTC - Sijie Guo: `nettyMaxFrameSizeBytes` also needs to be set on the bookie side.
+1 : Roman Popenov
----
2020-02-24 16:58:40 UTC - Sijie Guo: Unloading closes the topics of that bundle, and those topics are opened again when the bundle is loaded. Closing the topics requires metadata updates, so if you have many topics within a bundle, it takes time. You can consider increasing the number of bundles.
----
2020-02-24 17:07:14 UTC - Vladimir Shchur: Thank you for the confirmation, but these lines tell us that it should be taken from broker config, are they eventually ignored? <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/BookKeeperClientFactoryImpl.java#L106>
----
2020-02-24 17:16:05 UTC - Sijie Guo: there are two settings
----
2020-02-24 17:16:19 UTC - Sijie Guo: one in the client, the other one is the bookie
----
2020-02-24 17:16:51 UTC - Sijie Guo: the client one is taken from the broker configuration, and the server-side one is here: <https://github.com/apache/bookkeeper/blob/master/conf/bk_server.conf#L201>
----
2020-02-24 17:20:39 UTC - Pavel Tishkevich: We have 4 namespaces, each having 1024 bundles. So I believe when 1 of 3 brokers fails, around 1300 bundles need to be unloaded and acquired by the remaining 2 brokers.

This still proves to be quite slow, and the two issues described above occur.
Maybe something else could help here?
----
2020-02-24 17:27:02 UTC - Sijie Guo: Oh I see. If you only have 3 brokers, it means that if one broker goes down, about 70k topics are impacted. It will take some time to get those 70k re-owned by the other two brokers.
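
The figures in this exchange follow from simple division. The sketch below, in Python, uses only numbers quoted in the thread (200000 topics, 4 namespaces with 1024 bundles each, 3 brokers) and assumes that topics and bundles are spread evenly across brokers, which a real cluster will only approximate.

```python
# Figures quoted in the thread; the even spread is an assumption.
TOPICS = 200_000
NAMESPACES = 4
BUNDLES_PER_NAMESPACE = 1024
BROKERS = 3

total_bundles = NAMESPACES * BUNDLES_PER_NAMESPACE
bundles_per_broker = total_bundles / BROKERS
topics_per_bundle = TOPICS / total_bundles


def topics_per_broker(brokers):
    """Topics owned by each broker if ownership is spread evenly."""
    return TOPICS / brokers


print(f"total bundles: {total_bundles}")
print(f"bundles on a failed broker: about {bundles_per_broker:.0f}")
print(f"topics on a failed broker: about {topics_per_broker(BROKERS):.0f}")
print(f"topics per bundle: about {topics_per_bundle:.1f}")
# More brokers means fewer topics to re-own when one of them fails.
print(f"topics per broker with 6 brokers: about {topics_per_broker(6):.0f}")
```

With 3 brokers this gives roughly 1365 bundles and about 67k topics per broker, in line with the "around 1300 bundles" and "about 70k topics" mentioned above.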
----
2020-02-24 17:28:32 UTC - Sijie Guo: Currently there is no good way to get around this. The short-term solution is to increase the number of brokers, so each broker owns fewer topics. The long-term solution is for the committers and contributors to figure out how to support this use case.
----
2020-02-24 17:35:53 UTC - Sree Vaddi: It’s happening tonight. I look forward to seeing you all.
Those outside the area can participate using an online link (it will be shared before the meeting starts).

<https://www.meetup.com/Apache-Heron-Bay-Area/events/nglzdrybcdbwb/>
----
2020-02-24 17:53:01 UTC - Vladimir Shchur: I see, much clearer now, thank you!
----
2020-02-24 17:55:16 UTC - Pavel Tishkevich: Thanks a lot!
----
2020-02-24 18:45:11 UTC - Kirill Podkov: [Functions Question] :wave: We've deployed a Python function that consumes an input topic, in which the messages contain a partition_key, however this key is not passed to the messages within the output topic. Is this intended? We're using the default Identity SerDe, and the function simply returns the input under certain conditions without any custom classes.
----
2020-02-24 18:56:05 UTC - Sijie Guo: Yes, by default it doesn’t pass properties from the source message to the output topic. A new flag was added to Java functions to do so, but Python functions don’t support it yet.
heavy_check_mark : Kirill Podkov
cry : Manuel Mueller
----
2020-02-24 20:10:35 UTC - Alexander Ursu: Hi, any tips on getting a Pulsar cluster set up with Docker? I'm looking for something more resilient than the standalone version.
----
2020-02-24 20:28:44 UTC - Greg Gallagher: @Alexander Ursu - are you using Docker compose, or...?
----
2020-02-24 20:28:54 UTC - Alexander Ursu: Docker Swarm
----
2020-02-24 20:30:10 UTC - Greg Gallagher: I'm trying to do something similar for a POC of Pulsar, but to be honest I threw Docker under the bus and am just deploying onto VMs. There don't seem to be explicit instructions on this yet, and there's an issue opened:
----
2020-02-24 20:30:38 UTC - Greg Gallagher: <https://github.com/apache/pulsar/issues/5401>
----
2020-02-24 20:30:51 UTC - Greg Gallagher: (I'm not a Pulsar developer, I just joined this Slack yesterday)
----
2020-02-24 20:30:59 UTC - Greg Gallagher: Worth noting, here is how I was going to go about this:
----
2020-02-24 20:31:17 UTC - Greg Gallagher: 1. git clone <https://github.com/apache/pulsar>
----
2020-02-24 20:32:47 UTC - Greg Gallagher: 2. check the yml files under the deployment/kubernetes/generic/original directory, which would be used if you deployed under Kubernetes. That'll give you the environment variables (top section, ConfigMap stuff)
----
2020-02-24 20:34:11 UTC - Greg Gallagher: 3. create a stack.yml which mimics this file. I haven't used Swarm in years and frankly wouldn't suggest it, but you're not asking for advice on this front so I'll hold back. We use Kubernetes, which is hideously complicated to run on-prem in a bare-metal environment. If I could do it all over again I'd really suggest looking at HashiCorp Nomad
----
2020-02-24 20:34:14 UTC - Greg Gallagher: hope that's helpful!
----
2020-02-24 20:40:09 UTC - Alexander Ursu: Thanks, will look into it, been trying to decipher a lot of it myself recently and make yml files for swarm stacks. Currently sticking to Swarm because I'm in a small team and k8s seems overkill at the moment. However, Nomad has been on my radar for a bit and it seems interesting
----
2020-02-24 20:43:19 UTC - Greg Gallagher: Yeah, Nomad also lacks documentation coverage of how to properly set up a cluster, step-by-step. Everyone seems to have two modes for documentation: "run this -dev standalone thing" or "figure it out on your own" :confused: Took me a few hours this past Saturday to figure out how to create a Nomad cluster using Consul and Nomad. It wasn't anything close to the learning curve of k8s, and you definitely should consider the size of the team when deciding what to manage. Unfortunately my developers beat me into a corner when I started and demanded k8s from the start. Wish I had known then what I know now so I could tell them "nah", but at the time it seemed like what everyone was doing. I would suggest that k8s in the cloud is far, far better than running on-prem. Good luck!
----
2020-02-25 04:52:52 UTC - vanchhay: I've been wondering, "MessageId.latest" does this field refer to the latest_message in the topic?
----
2020-02-25 04:55:08 UTC - Sijie Guo: No, it doesn’t.
----
2020-02-25 05:07:16 UTC - Rattanjot Singh: Has anyone deployed pulsar in east and west region using kubernetes?
----
2020-02-25 05:52:48 UTC - Greg Gallagher: Basic question: I have a local (baremetal) cluster running 4 bookkeeper nodes and 4 broker nodes (separate VMs). In conf/broker.conf what is the right number to set for managedLedgerDefaultEnsembleSize and managedLedgerDefaultWriteQuorum and managedLedgerDefaultAckQuorum ? The user guide for baremetal (<https://pulsar.apache.org/docs/en/deploy-bare-metal/>) says "1" for these values, which doesn't make sense since the example configuration is similar to what I'm listing. Should it be number of bookies - 1? Like:
```# Number of bookies to use when creating a ledger
managedLedgerDefaultEnsembleSize=3

# Number of copies to store for each message
managedLedgerDefaultWriteQuorum=3

# Number of guaranteed copies (acks to wait before write is complete)
managedLedgerDefaultAckQuorum=3```
----
2020-02-25 05:55:18 UTC - Sijie Guo: I think the guide says if you are deploying a one-node cluster, you need to change those settings to 1. Otherwise the default values are good enough for most of the use cases.
----
2020-02-25 05:56:41 UTC - Greg Gallagher: it does say 1, yes, but I'm deploying a 4 node cluster (4 bookies, 4 brokers) ... is it still 1? thanks!
----
2020-02-25 06:14:54 UTC - Greg Gallagher: n/m I see I'm asking a complicated question and need to carefully read this:
----
2020-02-25 06:14:55 UTC - Greg Gallagher: <https://bookkeeper.apache.org/docs/4.10.0/development/protocol/#ensembles>
----
2020-02-25 07:23:41 UTC - Sijie Guo: Ensemble size is how many bookies are used for storing a ledger, write quorum size is how many copies of an entry are stored, and ack quorum size is how many responses to wait for before confirming that a write succeeded. So most of the time you can just stick to 2/2/2 (which is the default setting); if you want higher guarantees you can use 3/3/2.

But if you have fewer nodes than your replication settings require (for example, if you have one node), then you have to set it to 1/1/1. Otherwise you are not able to create ledgers, since you don’t have enough bookies.

I wrote an article about this before. You can check it out to understand what those settings mean in BookKeeper replication. <https://streaml.io/blog/why-apache-bookkeeper>
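
The constraints described above can be written down as a small check. This is a minimal sketch in Python, not part of Pulsar or BookKeeper; the function name `check_replication` is made up for illustration, and it assumes the usual ordering of ensemble size >= write quorum >= ack quorum, plus at least as many bookies as the ensemble size.

```python
def check_replication(bookies, ensemble, write_quorum, ack_quorum):
    """Return a list of problems with an ensemble/write/ack setting (empty if fine)."""
    problems = []
    if ack_quorum < 1:
        problems.append("ack quorum must be at least 1")
    if ack_quorum > write_quorum:
        problems.append("ack quorum is larger than write quorum")
    if write_quorum > ensemble:
        problems.append("write quorum is larger than ensemble size")
    if ensemble > bookies:
        problems.append("not enough bookies for the ensemble size")
    return problems


# Settings mentioned in the thread, as (bookies, ensemble, write, ack).
for setting in [(4, 2, 2, 2), (4, 3, 3, 2), (4, 3, 3, 3), (1, 2, 2, 2), (1, 1, 1, 1)]:
    print(setting, check_replication(*setting) or "ok")
```

On a 4-bookie cluster, 2/2/2, 3/3/2 and 3/3/3 all pass this check, while 2/2/2 on a single bookie fails because there are not enough bookies to form the ensemble.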
----
2020-02-25 08:14:26 UTC - xue: Pulsar 2.5.0, start broker
----