Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2018/07/20 09:11:04 UTC

Slack digest for #general - 2018-07-20

2018-07-19 15:35:36 UTC - Grant Wu: @Grant Wu uploaded a file: <https://apache-pulsar.slack.com/files/UBHR9CH5E/FBURSCHF0/why_does_pulsar_standalone_just_die_.txt|Why does pulsar standalone just die?>
----
2018-07-19 15:36:14 UTC - Grant Wu: Has anyone had `./pulsar standalone` just exit after a while?
----
2018-07-19 16:24:18 UTC - Sijie Guo: Where do you run standalone? Laptop?
----
2018-07-19 16:48:03 UTC - Grant Wu: Yeah
----
2018-07-19 16:48:13 UTC - Grant Wu: Hrm might be if the computer goes to sleep….
----
2018-07-19 17:08:51 UTC - Sijie Guo: I think that when the computer goes to sleep and then comes back up, we sometimes see the zookeeper session expire. that’s probably the reason why it exited
----
2018-07-19 17:09:11 UTC - Grant Wu: Okay, I’ll keep that in mind
----
2018-07-19 19:43:56 UTC - Grant Wu: @Grant Wu uploaded a file: <https://apache-pulsar.slack.com/files/UBHR9CH5E/FBTEMRGGL/npe.txt|NPE>
----
2018-07-19 19:47:49 UTC - Grant Wu: aha, the v2 needs to go before consumer
----
2018-07-19 19:48:01 UTC - Sijie Guo: yes
----
2018-07-19 19:48:42 UTC - Grant Wu: is there any way to get a more useful error message here
----
2018-07-19 19:50:12 UTC - Grant Wu: <https://google.github.io/guava/releases/snapshot/api/docs/com/google/common/base/Preconditions.html#checkArgument-boolean-> we could use a different overload of `checkArgument`?
----
2018-07-19 19:51:21 UTC - Sijie Guo: yes, we should be able to attach an error message in checkArgument, which would provide a better error message
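
A minimal sketch of the difference between the two overloads. Guava is not imported here: the two `checkArgument` methods are local stand-ins that mimic the documented behaviour of `Preconditions.checkArgument(boolean)` and `Preconditions.checkArgument(boolean, Object)`. The path check and the class and method names are hypothetical, for illustration only, and are not Pulsar's actual validation code.

```java
public class CheckArgumentSketch {

    // Stand-in for Guava's Preconditions.checkArgument(boolean):
    // the exception carries no message.
    static void checkArgument(boolean expression) {
        if (!expression) {
            throw new IllegalArgumentException();
        }
    }

    // Stand-in for Guava's Preconditions.checkArgument(boolean, Object):
    // the message is attached to the exception.
    static void checkArgument(boolean expression, Object errorMessage) {
        if (!expression) {
            throw new IllegalArgumentException(String.valueOf(errorMessage));
        }
    }

    // Hypothetical validation: returns "ok" or the exception message.
    static String describeFailure(String path, boolean withMessage) {
        try {
            boolean ok = path.startsWith("/v2/");
            if (withMessage) {
                checkArgument(ok, "Expected path to start with /v2/, got: " + path);
            } else {
                checkArgument(ok);
            }
            return "ok";
        } catch (IllegalArgumentException e) {
            // getMessage() is null when no message was attached.
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFailure("/consumer/v2/topic", false));
        System.out.println(describeFailure("/consumer/v2/topic", true));
    }
}
```

With the message overload, the caller sees which value was rejected instead of a bare `IllegalArgumentException`.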
----
2018-07-19 19:51:36 UTC - Grant Wu: Okay I’ll open a PR eventually
----
2018-07-19 19:54:49 UTC - Idan: @Matteo Merli have you seen my consumer code? we still have no idea what to do regarding the ack errors we are receiving
----
2018-07-19 19:55:26 UTC - Idan: @Idan shared a file: <https://apache-pulsar.slack.com/files/UALJD8929/FBQAVRE76/Pulsar_Consumer_logic.java|Pulsar Consumer logic>
----
2018-07-19 20:03:04 UTC - Matteo Merli: Hi @Idan, the code looks correct, that’s a completely valid usage of API — Unfortunately I have not got the chance to try to reproduce it. My suspicion is that somehow the acks tracker gets tripped over by the acks from multiple threads and it ends up thinking some messages are not being acked and therefore asks the broker for re-delivery. Ultimately, the redelivery becomes a no-op if the messages were already marked as acknowledged by brokers.
----
2018-07-19 20:13:31 UTC - Idan: yes, the thing is that re-delivery doesn’t really occur
----
2018-07-19 20:13:34 UTC - Idan: just that warning
----
2018-07-19 20:14:47 UTC - Idan: btw, regarding "somehow the acks tracker gets tripped over by the acks from multiple threads":
----
2018-07-19 20:15:29 UTC - Idan: we don’t ack from multiple threads concurrently… I mean only one thread will ack any given message, so a situation with multiple acks for the same message on different threads can’t actually happen
----
2018-07-19 20:16:53 UTC - Idan: @Idan uploaded a file: <https://apache-pulsar.slack.com/files/UALJD8929/FBT632J73/acking_logic.java|acking logic>
----
2018-07-19 20:17:45 UTC - Idan: @Matteo Merli i added a code snippet of how we ack messages (wrapping the pulsar ack api). i confirm that i see only one `Message acked: {} topic {} envetId {}` log line for each unique message that was acked.
----
2018-07-19 20:18:34 UTC - Idan: so that also confirms only one ack occurs and that the ack was successful before the ack timeout
----
2018-07-19 20:18:39 UTC - Matteo Merli: Sure, though all the threads are receiving/acking from same consumer instance (which, again, is a perfectly valid usage)
----
2018-07-19 20:19:08 UTC - Idan: yes, i put these consumers in a map (topicName, Consumer)
----
2018-07-19 20:19:30 UTC - Idan: this way i make sure i am not initiating multiple consumers and am always using the same consumer instance on multiple threads
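
A minimal sketch of the map-based pattern described above, not the code from the uploaded files. `TopicConsumer` is a hypothetical stand-in for the real Pulsar consumer type, and `ConsumerRegistrySketch`, `consumerFor` and `countCreations` are illustrative names. The sketch assumes a `ConcurrentHashMap`, whose `computeIfAbsent` applies the factory at most once per key, so threads asking for the same topic share one instance.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class ConsumerRegistrySketch {

    // Hypothetical stand-in for a Pulsar consumer; not the real client interface.
    interface TopicConsumer {
        String topic();
    }

    private final ConcurrentMap<String, TopicConsumer> consumers = new ConcurrentHashMap<>();
    private final Function<String, TopicConsumer> factory;

    ConsumerRegistrySketch(Function<String, TopicConsumer> factory) {
        this.factory = factory;
    }

    // Returns the single shared consumer for a topic, creating it on first use.
    TopicConsumer consumerFor(String topicName) {
        return consumers.computeIfAbsent(topicName, factory);
    }

    // Starts several threads that all request the same topic and
    // returns how many consumers were actually created.
    static int countCreations(int threadCount) {
        AtomicInteger created = new AtomicInteger();
        Function<String, TopicConsumer> countingFactory = topic -> {
            created.incrementAndGet();
            return () -> topic;
        };
        ConsumerRegistrySketch registry = new ConsumerRegistrySketch(countingFactory);

        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> registry.consumerFor("my-topic"));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("consumers created: " + countCreations(8));
    }
}
```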
----
2018-07-19 20:20:28 UTC - Idan: so every time a consumer is initiated i do it this way:
----
2018-07-19 20:20:57 UTC - Idan: @Idan uploaded a file: <https://apache-pulsar.slack.com/files/UALJD8929/FBTSK6M29/init_pulsar_consumer.java|init pulsar consumer> and commented: @Matteo Merli
----
2018-07-19 20:22:00 UTC - Idan: @Idan uploaded a file: <https://apache-pulsar.slack.com/files/UALJD8929/FBTNK8K8A/consumerwrapper.java|ConsumerWrapper>
----
2018-07-19 20:28:07 UTC - Grant Wu: @Grant Wu uploaded a file: <https://apache-pulsar.slack.com/files/UBHR9CH5E/FBUPQR287/-.txt|Untitled>
----
2018-07-19 20:28:18 UTC - Grant Wu: @Sijie Guo Might be related to the sleep issues?
----
2018-07-19 20:28:36 UTC - Grant Wu: I noticed my laptop had gone to sleep, so I ^C `./pulsar standalone` and started it back up
----
2018-07-19 20:28:38 UTC - Grant Wu: It immediately crashed
----
2018-07-19 20:34:18 UTC - Sijie Guo: oh, that is a different issue. that is a rocksdb problem with empty sst files. we upgraded the rocksdb version in the upcoming 2.1 release, which would address the problem. this problem is seen in standalone when standalone is shut down with no data written, so an empty sst file is produced, which triggers the bug in rocksdb.

the resolution is to remove the empty sst file. for standalone, it is probably simpler to remove the data directory and start fresh
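
A minimal sketch for locating such files before removing anything. It assumes that "empty" means a zero-byte file whose name ends in `.sst`, and that the data directory is passed as an argument; the class and method names are hypothetical. It only lists candidates and deletes nothing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class EmptySstFinder {

    // Walks dataDir recursively and returns every zero-byte *.sst file.
    static List<Path> findEmptySstFiles(Path dataDir) {
        try (Stream<Path> paths = Files.walk(dataDir)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().endsWith(".sst"))
                .filter(EmptySstFinder::isZeroBytes)
                .sorted()
                .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean isZeroBytes(Path file) {
        try {
            return Files.size(file) == 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java EmptySstFinder <data-directory>");
            return;
        }
        // Print candidates only; removal is left to the operator.
        findEmptySstFiles(Paths.get(args[0])).forEach(System.out::println);
    }
}
```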
pray : Grant Wu
----
2018-07-19 20:48:47 UTC - Grant Wu: thanks
----
2018-07-19 20:48:50 UTC - Grant Wu: uh, another issue…
----
2018-07-19 20:49:12 UTC - Grant Wu: @Grant Wu uploaded a file: <https://apache-pulsar.slack.com/files/UBHR9CH5E/FBV32G45U/some_sort_of_weird_dns_issue_.txt|Some sort of weird DNS issue?> and commented: @Sijie Guo Have you seen this before?
----
2018-07-19 20:53:24 UTC - Sijie Guo: hmm, I think that is related to DNS resolution. sometimes problems show up when you move your computer around, because you are connecting to different WiFi networks.

what I usually do is specify an advertised address when starting standalone, so the hostnames are fixed to localhost or 127.0.0.1

`bin/pulsar standalone --advertised-address 127.0.0.1`
----
2018-07-19 20:55:58 UTC - Grant Wu: hrm, okay
----
2018-07-19 20:56:02 UTC - Grant Wu: I’ll keep that in mind
----
2018-07-20 00:11:53 UTC - Sijie Guo: we just started a new release candidate vote for the 2.1.0 release. <http://mail-archives.apache.org/mod_mbox/pulsar-dev/201807.mbox/%3CCAO2yDyaG-BwRmfuH9g5oDrW9-Ve3eRyy5NoGXxr1cL6f6Yu3kQ%40mail.gmail.com%3E>

if you are interested in the 2.1.0 release, please help us validate and vote on the release candidate :slightly_smiling_face:
+1 : Igor Zubchenok
----

Re: EXTERNAL: Slack digest for #general - 2018-07-20

Posted by "Schaffert, Lowell" <lo...@lmco.com>.

2.0.1-incubating


Lowell Schaffert

lowell.schaffert@lmco.com



________________________________
From: Apache Pulsar Slack <ap...@gmail.com>
Sent: Friday, July 20, 2018 3:11:04 AM
To: users@pulsar.incubator.apache.org
Subject: EXTERNAL: Slack digest for #general - 2018-07-20
