Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2018/07/06 09:11:04 UTC

Slack digest for #general - 2018-07-06

2018-07-05 10:36:57 UTC - Chris Miller: I've just seen the most recent reply to this Kafka vs Pulsar question on StackOverflow and was wondering if anyone could comment on the Pulsar cons, many of which I haven't heard before (the points about messageId concept, read last message, operational complexity, latency):
<https://stackoverflow.com/questions/46048608/what-are-the-advantages-and-disadvantages-of-kafka-over-apache-pulsar/47477765>
----
2018-07-05 10:51:34 UTC - Julien Nioche: hi, is there a mechanism within Pulsar or Bookkeeper which I could use to find whether a message with a particular ID has already been seen? At the moment I am relying on Redis but if there is a way of doing this 100% within Pulsar that would be even better. This is not like compaction, as I don't want to keep the latest message but want to discard any message with a particular status if it has been seen before. That would not be for all the messages, just some of them.
----
2018-07-05 11:44:23 UTC - Byron: has anyone successfully built a static Go binary against the libpulsar static library? i am doing the standard `go build -ldflags '-extldflags "-static"'` command, but i am getting a slew of `undefined reference` errors. omitting the `-ldflags` (and thus linking as a shared library) works.
----
2018-07-05 14:06:26 UTC - Grant Wu: out of curiosity, @Matteo Merli do you know what the reasoning was behind putting that RPC in the Admin API?
----
2018-07-05 14:06:42 UTC - Grant Wu: It seems like doing so means that you have to take special care with authn/authz
----
2018-07-05 14:33:25 UTC - Josh West: @Byron Our developers here are having lots of problems with the Golang client too
----
2018-07-05 14:43:09 UTC - Byron: @Josh West Had you tried this native client at all? <https://github.com/t2y/go-pulsar>
----
2018-07-05 15:03:33 UTC - Beast in Black: @Matteo Merli thanks for your reply
----
2018-07-05 15:08:03 UTC - Matteo Merli: Just saw the answer, will reply here and on SO
----
2018-07-05 15:57:21 UTC - Matteo Merli: > has anyone successfully built a static Go binary against the libpulsar static library? i am doing the standard `go build -ldflags '-extldflags "-static"'` command, but i am getting a slew of `undefined reference` errors. omitting the `-ldflags` (and thus linking as a shared library) works.

@Byron I haven’t been able to get the static binary to work yet. I’m not sure what’s the issue though. For now I’d recommend to stick with the RPM/Deb packages with the runtime lib

@Josh West What kind of errors are you seeing?
----
2018-07-05 16:05:23 UTC - Byron: @Matteo Merli Ok thanks
----
2018-07-05 16:30:03 UTC - Byron: running the pulsar-dashboard image i am getting the error

```
2018-07-05 16:27:07,805 INFO spawnerr: can't find command '/usr/lib/postgresql/9.4/bin/postgres'
2018-07-05 16:27:07,805 INFO success: uwsgi entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-07-05 16:27:09,920 INFO spawnerr: can't find command '/usr/lib/postgresql/9.4/bin/postgres'
2018-07-05 16:27:13,214 INFO spawnerr: can't find command '/usr/lib/postgresql/9.4/bin/postgres'
2018-07-05 16:27:13,214 INFO gave up: postgres entered FATAL state, too many start retries too quickly
```
----
2018-07-05 16:30:33 UTC - Byron: for 2.0.1
----
2018-07-05 16:31:21 UTC - Matteo Merli: Yes, someone else pointed that out. The postgres version inside the docker image was updated to 9.6 but the script is still pointing to 9.4
----
2018-07-05 16:34:28 UTC - Byron: 2.0.0-rc1-incubating tag works fine
----
2018-07-05 16:39:32 UTC - Matteo Merli: Ok, then the image was created before the upgrade in the Debian base image happened
----
2018-07-05 16:39:39 UTC - Matteo Merli: I created a PR to fix this <https://github.com/apache/incubator-pulsar/pull/2088>
----
2018-07-05 16:39:39 UTC - Byron: Yep
----
2018-07-05 16:39:44 UTC - Byron: Or the postgresql package
----
2018-07-05 16:40:48 UTC - Matteo Merli: (I guess part of the issue is depending on `python:2.7` as a starting point rather than, say, `debian:9`)
----
2018-07-05 16:41:01 UTC - Byron: I noticed the package name in the `apt-get` command doesn’t pin any versions
----
2018-07-05 16:41:12 UTC - Byron: so the version can change each time the image is built
----
2018-07-05 16:41:18 UTC - Byron: (in theory)
----
2018-07-05 17:06:05 UTC - Matteo Merli: @Chris Miller Replying to <https://stackoverflow.com/a/51106433/8420334> : 

> messageId concept tied to BookKeeper - consumers cannot easily position itself on the topic compared to Kafka offset which is continuous sequence of numbers.

Consumers can use a messageId to position on any message. A MessageId can also be stored outside Pulsar and
used to roll back to a specific message.

> Reader cannot easily read last message in the topic - need to skim through all the messages to the end.

A Reader can specify `MessageId.latest` to position itself at the end of the stream

> higher operational complexity - Zookeeper + Broker nodes + BookKeeper - all clustered

For small clusters, the recommended deployment mode is to combine broker & bookkeeper. That is the same
number of components as Kafka

> latency questionable - there is one extra remote call between Broker node and BookKeeper (compared to Kafka)

This is not true. First, Kafka also has an extra network hop (when replicating to another broker). Second,
Pulsar with BookKeeper can in fact guarantee much lower latency than Kafka, even while offering strong
durability, compared to Kafka's in-memory page cache approach. Latency for a messaging system is typically
dominated by the disk access pattern rather than by the network. One extra network hop is ~0.1 milliseconds; Pulsar
can guarantee 99pct latency of < 5 ms, while Kafka would usually be at ~15 ms (with no data durability), with
spikes to 100s of ms (for the 99pct).
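
As a rough back-of-the-envelope check of the figures quoted in this reply (an illustration only, not a benchmark), the sketch below computes what share of each quoted latency a single ~0.1 ms hop would represent. The function name `hop_share` and the constant names are made up for this example; the numbers are the ones given in the message. At these figures one hop is about 2% of a 5 ms budget.

```python
def hop_share(hop_ms, total_ms):
    """Fraction of a latency figure accounted for by one network hop."""
    return hop_ms / total_ms


HOP_MS = 0.1             # ~0.1 ms per extra hop, as quoted in the reply
PULSAR_P99_MS = 5.0      # upper bound quoted for Pulsar's 99pct latency
KAFKA_TYPICAL_MS = 15.0  # typical Kafka figure quoted in the reply

for name, total in [("Pulsar 99pct bound", PULSAR_P99_MS),
                    ("Kafka typical", KAFKA_TYPICAL_MS)]:
    share = hop_share(HOP_MS, total)
    print(f"{name}: one hop is {share:.1%} of {total} ms")
```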

You can test different messaging systems with the OpenMessaging benchmark: <http://openmessaging.cloud/docs/benchmarks/>
100 : Byron
+1 : Chris Miller
----
2018-07-05 17:10:38 UTC - Daniel Ferreira Jorge: I'm having a pretty hard time trying to not lose messages *and* not have them published out of order (for a partition) when there is an abrupt failure between the java process and the broker (stopping standalone pulsar docker container). 

What I have here is one partitioned topic with 1024 partitions, being consumed by one consumer and produced to another non partitioned topic. Everything is synchronous and pretty standard. I instantiate a producer and a consumer, inside a loop I call consumer.receive(), then, producer.send() and then, consumer.acknowledge(). I have about 10K messages in the topic being consumed. 

The procedure I'm following here is to start the java process, let it consume about 3K messages, then stop the pulsar container and leave the java process running. After about 1 minute I restart the container and message consumption resumes, *but* it does not consume all of the messages. To consume the remaining messages, I have to stop and restart the java process. 

I'm pretty sure that my completely grotesque java skills are in the way. I'm probably missing some configuration either on the producer or on the consumer... Does anybody have some code sample that allows me to consume a partitioned topic and produce to another topic in a way that is resilient to failures?
----
2018-07-05 17:14:19 UTC - Matteo Merli: @Daniel Ferreira Jorge I think you might be getting exceptions during the `consumer.acknowledge()` part while the standalone container is stopped. You should be either catching those or just ignore them. Eg: you could use `consumer.acknowledgeAsync()` and don’t check the result (if ack fails, the message will be redelivered anyway)
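
The suggestion above (don't let a failed ack break the receive loop) can be sketched roughly as follows. This is a minimal illustration, not Pulsar client code: `FakeConsumer`, `BrokerUnavailable` and `relay` are hypothetical stand-ins invented here so the example runs without a broker, and the ack failure is simulated.

```python
class BrokerUnavailable(Exception):
    """Hypothetical stand-in for a client-side exception raised on a failed ack."""


class FakeConsumer:
    """Hypothetical in-memory consumer; NOT the Pulsar client API."""

    def __init__(self, messages, fail_acks_for=()):
        self._pending = list(messages)
        self._fail_acks_for = set(fail_acks_for)
        self.acked = []

    def receive(self):
        # Return the next message, or None when nothing is left.
        return self._pending.pop(0) if self._pending else None

    def acknowledge(self, msg):
        # Simulate the broker being down for selected messages.
        if msg in self._fail_acks_for:
            raise BrokerUnavailable(f"ack failed for {msg!r}")
        self.acked.append(msg)


def relay(consumer, send):
    """Receive, forward, ack. A failed ack is recorded, but the loop keeps going."""
    unacked = []
    while True:
        msg = consumer.receive()
        if msg is None:
            break
        send(msg)
        try:
            consumer.acknowledge(msg)
        except BrokerUnavailable:
            # Swallow the failure; per the reply above, an unacked
            # message is expected to be redelivered later.
            unacked.append(msg)
    return unacked


forwarded = []
consumer = FakeConsumer(["m1", "m2", "m3"], fail_acks_for={"m2"})
unacked = relay(consumer, forwarded.append)
print(forwarded, consumer.acked, unacked)
```

One consequence worth keeping in mind with this pattern: a message that was forwarded but not acked may be forwarded again after redelivery, so the target topic can see duplicates.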
----
2018-07-05 17:32:03 UTC - Grant Wu: Is this normal?
----
2018-07-05 17:32:06 UTC - Grant Wu: ```
13:31:12.085 [main] INFO  org.apache.bookkeeper.bookie.Bookie - Stamping new cookies on all dirs [data/standalone/bookkeeper0/current] [data/standalone/bookkeeper0/current]
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.bookkeeper.shaded.com.google.protobuf.UnsafeUtil (file:/Users/grant.wu/apache-pulsar-2.0.1-incubating/lib/org.apache.bookkeeper-bookkeeper-server-shaded-4.7.0.jar) to field java.nio.Buffer.address
WARNING: Please consider reporting this to the maintainers of org.apache.bookkeeper.shaded.com.google.protobuf.UnsafeUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
```
----
2018-07-05 17:32:41 UTC - Matteo Merli: @Grant Wu is that with JDK 9 / 10?
----
2018-07-05 17:33:17 UTC - Grant Wu: Yeah
----
2018-07-05 17:33:22 UTC - Grant Wu: Should I install something older?
----
2018-07-05 17:34:08 UTC - Grant Wu: ah yes that’s what the docs say
----
2018-07-05 17:38:39 UTC - Daniel Ferreira Jorge: @Matteo Merli thanks for the answer. You said that "if ack fails, the message will be redelivered anyway" but this redelivery will respect the order of messages within a partition? Because what is happening is this: From the ~10K messages of the topic I consume ~2K messages, stop the container and start it again, then another ~5K messages are consumed, leaving ~3K messages to be redelivered, which I assume are the failures to ack from when the container was stopped. By either calling `redeliverUnacknowledgedMessages()` or stopping and restarting the java process, I can get those remaining ~3K messages redelivered, but they will be out of order because when the pulsar container came back, the consumption did not resume from where it stopped.
----
2018-07-05 17:45:45 UTC - Matteo Merli: > but this redelivery will respect the order of messages within a partition?

Yes, when the client reconnects it would receive all unacknowledged messages. If you are using exclusive/failover subscription type, all messages are guaranteed to be replayed in order (per partition)
----
2018-07-05 17:46:58 UTC - Matteo Merli: > Leaving ~3K messages to be redelivered, which I assume are the failures to ack from when the container was stopped
Yes, if the exception on `consumer.acknowledge()` is not caught, it would probably make your app stop the receive() loop
----
2018-07-05 17:52:22 UTC - Grant Wu: This is somewhat of a silly question, but:
in one terminal I am running `for i in {1..10000}; do ./pulsar-client produce TestTopic -m "Message $i"; sleep 10; done`
in another I am running
`./pulsar-client consume -n 0 -s Subscription1 TestTopic`
and in the consuming side I get
```
----- got message -----
Message 12
13:51:04.173 [main] WARN  org.apache.pulsar.client.cli.PulsarClientTool - No message to consume after waiting for 20 seconds.
13:51:09.174 [main] WARN  org.apache.pulsar.client.cli.PulsarClientTool - No message to consume after waiting for 20 seconds.
```
----
2018-07-05 17:52:52 UTC - Grant Wu: Which is weird because I’m sleeping for 10 seconds?
----
2018-07-05 17:53:55 UTC - Matteo Merli: yes, the warn message is wrong. That was already fixed (or removed I think) in master code
----
2018-07-05 17:54:01 UTC - Grant Wu: ah, okay
----
2018-07-05 18:12:57 UTC - Daniel Ferreira Jorge: @Matteo Merli Inside the loop I do `catch (PulsarClientException e)` and I'm using an exclusive subscription.
----
2018-07-05 18:29:18 UTC - Rasty Turek: Is there a way to add a delay to a message? Something like the opposite of a TTL: start delivering this message at this time, or after this time. The only thing that I could think of was to put a sleep into a function, but I'm not sure what the consequences of that would be
----
2018-07-05 18:39:23 UTC - Grant Wu: Would it be worth making some more channels?  e.g. #performance or #newb
----
2018-07-05 18:58:56 UTC - Grant Wu: Is this another cosmetic bug?
----
2018-07-05 18:59:00 UTC - Grant Wu: ```
grant.wu@MBP00245s-MacBook-Pro:~/apache-pulsar-2.0.1-incubating/bin ⇒ ./pulsar-admin namespaces list
Main parameters are required ("tenant-name
")
```
----
2018-07-05 18:59:07 UTC - Grant Wu: namely the newline after tenant-name
----
2018-07-05 19:28:46 UTC - Byron: @Grant Wu you are missing the last argument to the command which is the tenant name
----
2018-07-05 19:28:58 UTC - Grant Wu: Yes I’m aware…
----
2018-07-05 19:29:15 UTC - Grant Wu: I’m saying that the fact that `")` got put on the next line
----
2018-07-05 19:29:22 UTC - Grant Wu: Might be a cosmetic bug
----
2018-07-05 19:30:33 UTC - Grant Wu: <https://github.com/apache/incubator-pulsar/blob/b2a373d903c183c95dd5e29b317ef03af9bf81ff/pulsar-client-tools/src/main/java/org/apache/pulsar/admin/cli/CmdNamespaces.java#L49>
----
2018-07-05 19:30:42 UTC - Grant Wu: Yes it is strange that there is an explicit `\n` here
----
2018-07-05 19:30:46 UTC - Grant Wu: to me, at least
----
2018-07-05 19:35:09 UTC - Grant Wu: Anyways, a more important question: I’m having trouble getting reset-cursor to work.
I’m running the following concurrently:
`./pulsar-client consume -n 0 -s Subscription1 TestTopic`
`./pulsar-client consume -n 0 -s Subscription2 TestTopic`
`for i in {1..10000}; do ./pulsar-client produce TestTopic -m "Message $(date)"; sleep 1; done`
----
2018-07-05 19:35:35 UTC - Grant Wu: And then in a fourth, I’m running
```
./pulsar-admin persistent reset-cursor \
  --subscription Subscription2 --time 1 \
  persistent://public/default/TestTopic
```
----
2018-07-05 19:36:03 UTC - Grant Wu: But it doesn’t seem to affect the messages consumed by Subscription2
----
2018-07-05 19:36:13 UTC - Byron: Ah I misread that as Slack wrapping those characters..
----
2018-07-05 19:36:22 UTC - Grant Wu: Am I misunderstanding how `reset-cursor` is supposed to work?
----
2018-07-05 19:36:47 UTC - Grant Wu: I’ve verified that I’ve got the right tenant and namespace using `peek-messages`
----
2018-07-05 19:37:11 UTC - Grant Wu: (with the Subscription2 consumer not running)
----
2018-07-05 19:41:39 UTC - Sijie Guo: @Grant Wu 

> Might be a cosmetic bug

yeah. seem so. do you want to file a bug or send a pull request for fixing that? “\n” seems to be placed there by mistake
----
2018-07-05 19:42:42 UTC - Grant Wu: Weirdly enough, there are a lot of `\n` there
----
2018-07-05 19:43:46 UTC - Sijie Guo: hmm that’s true. maybe @Matteo Merli has a better idea why there are a lot of `\n`.
----
2018-07-05 19:44:29 UTC - Matteo Merli: Not sure, I think it might also be coming from JCommander library
----
2018-07-05 19:45:30 UTC - Matteo Merli: oh.. I see now..
----
2018-07-05 19:45:55 UTC - Matteo Merli: I guess it was a trick to have spacing between subcommands. eg: 

```
    tenants      Operations about tenants
      Usage: tenants [options]

    brokers      Operations about brokers
      Usage: brokers [options]
```
----
2018-07-05 19:46:08 UTC - Matteo Merli: the empty line in the middle for readability
----
2018-07-05 19:46:14 UTC - Sijie Guo: > I’ve verified that I’ve got the right tenant and namespace using `peek-messages`
(with the Subscription2 consumer not running)

when do you stop the subscription 2?
----
2018-07-05 19:47:15 UTC - Grant Wu: uh, no specific time.  When I try to run `reset-cursor`, the subscription 2 consumer is running
----
2018-07-05 19:48:31 UTC - Grant Wu: It’s not in other files that look similar, though, like <https://github.com/apache/incubator-pulsar/blob/1334005eda24564971343c42d0df3102fbe40039/pulsar-client-tools/src/main/java/org/apache/pulsar/admin/cli/CmdTenants.java>
----
2018-07-05 19:48:37 UTC - Grant Wu: was this trick specific to this file?
----
2018-07-05 19:54:12 UTC - Grant Wu: wait, hold on
----
2018-07-05 19:54:22 UTC - Grant Wu: I think I might have derped and not noticed the reset
----
2018-07-05 19:54:30 UTC - Grant Wu: because I had un rate limited consuming
----
2018-07-05 20:01:38 UTC - Grant Wu: Yeah, I think that was the problem.  Sorry for the noise :man-facepalming:
----
2018-07-05 20:09:02 UTC - Sijie Guo: :slightly_smiling_face:
----
2018-07-05 21:22:12 UTC - Josh West: hey @Matteo Merli -- @Ryan Rose could probably direct you to the issues they've been having when building/using the official pulsar golang client.  most of the problems have stemmed from trying to get the c++ library and openssl library properly incorporated.
----
2018-07-05 21:25:51 UTC - Matteo Merli: @Ryan Rose Have you tried creating the RPMs/Deb as described in <http://pulsar.apache.org/docs/latest/clients/Cpp/#Linux-wjachl> ? Alternatively, the release candidate has the binaries already (it’s not officially released though!) <https://dist.apache.org/repos/dist/dev/incubator/pulsar/pulsar-2.1.0-incubating-candidate-4/>
----
2018-07-05 21:26:29 UTC - Josh West: @Matteo Merli looks like ryan stepped out for the day so i'm sure he'll get back to you tomorrow
+1 : Matteo Merli
----
2018-07-05 21:41:16 UTC - Ryan Rose: @Josh West Hi! I've been able to get the c++ packages working (via the deb script); there was just that issue with OpenSSL a while back. The only problem now is that the deb package doesn't work with the go client files brought in by `go get -u github.com/apache/incubator-pulsar/pulsar-client-go/pulsar`: there are some type errors in the linked header files. As long as I copy the pulsar-client-go files from master, it works! 
----
2018-07-05 21:45:08 UTC - Matteo Merli: Ok, the openssl should be fixed in the latest RPMs (and scripts to build them). 

Yes, the RPMs/DEB should be working with latest master of Go sources. 
Shouldn’t go get always fetch the code from master? One of the issues with “go get” is that it doesn’t allow pinning to a certain tag. The solutions I’ve seen were to “cd” into that folder and manually check out a tag.
----
2018-07-05 21:48:40 UTC - Ryan Rose: I believe that is how go get should behave, but recent attempts have proven otherwise... :/ 
----
2018-07-05 21:49:50 UTC - Ryan Rose: Maybe there's something going on with dep, I'll look into that and report back. 
----
2018-07-05 21:50:53 UTC - Matteo Merli: You can try to remove `~/go/src/github.com/apache/incubator-pulsar/` and `go get` again
----
2018-07-05 23:04:09 UTC - Byron: I can confirm building the deb from master works. Did it this morning
----
2018-07-06 08:45:38 UTC - Chris Miller: Thanks very much for debunking these Merlimat. I've been evaluating Pulsar vs Kafka and a couple of these were giving me a bit of pause.
----