Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2020/02/15 09:11:04 UTC

Slack digest for #general - 2020-02-15

2020-02-14 10:34:28 UTC - Per Ljusback: @Per Ljusback has joined the channel
----
2020-02-14 13:05:19 UTC - jboehm: @jboehm has joined the channel
----
2020-02-14 14:56:50 UTC - Santiago Del Campo: I'm facing a weird issue:

Context: Trying to deploy (k8s yamls) pulsar-dashboard (2.5.0) inside a different namespace (not *default*, but *pulsar1*). The collector.py throws an exception with the following:
*Message: 'ERROR: '*
*Arguments: (ConnectionError(MaxRetryError("HTTPConnectionPool(host='broker.default.svc.cluster.local', port=8080): Max retries exceeded with url: /admin/v2/brokers/local (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff984db63d0>: Failed to establish a new connection: [Errno -2] Name or service not known'))")),)*

Reading the scripts and code around the stats collector, I can see it uses the service name for the broker (hence --> <http://broker:8080>), so I don't understand why it is trying to go to the default namespace if the pulsar-dashboard pod is deployed inside a different namespace (if I ping *broker* from a pod within the same *pulsar1* namespace, it resolves correctly to the corresponding broker pod).

And what is even weirder... if I manually rewrite the complete domain for the broker pod with the correct namespace (*broker.pulsar1.svc.cluster.local*), the output is the same, as if it's ignoring the domain name I just passed as the argument to the script (*collector.py <URL_service_with_correct_namespace>*).

Any ideas?
----
2020-02-14 15:58:05 UTC - Sijie Guo: isn’t your broker deployed in the default namespace? if so, then `broker.<k8-namespace>.svc.cluster.local` is the fully qualified host name.
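The in-cluster DNS convention referenced above can be sketched as a tiny helper (the `cluster.local` suffix is the common default, but a cluster can override its domain):

```python
def k8s_service_fqdn(service: str, namespace: str,
                     cluster_domain: str = "cluster.local") -> str:
    """Fully qualified in-cluster DNS name for a Kubernetes Service.

    A bare service name like `broker` only resolves from pods in the
    same namespace; from anywhere else the namespace must be spelled out.
    """
    return f"{service}.{namespace}.svc.{cluster_domain}"
```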
----
2020-02-14 16:07:52 UTC - Aaron Zhuo: Hello, does anyone know how to set up a Presto cluster in k8s? I have successfully set up a Pulsar cluster in my GKE cluster using these yaml files: <https://github.com/apache/pulsar/tree/master/deployment/kubernetes/google-kubernetes-engine>. But I couldn't find instructions on setting up a Presto cluster to enable Pulsar SQL in k8s.
----
2020-02-14 16:09:03 UTC - Aaron Zhuo: Another question: does pulsar-go-client support running Pulsar SQL?
----
2020-02-14 16:41:28 UTC - Santiago Del Campo: No.. the broker is deployed within the pulsar1 namespace, just like the rest of the cluster.
----
2020-02-14 16:50:31 UTC - Santiago Del Campo: I even deployed a pulsar-dashboard as a standalone docker container outside of the k8s cluster... pointed it at the proxy, port 30001, and the collector logs show that it is trying to go to the default namespace (even though the proxy is also within the *pulsar1* namespace).
----
2020-02-14 16:52:14 UTC - Sijie Guo: at brokers, can you run `bin/pulsar-admin brokers list <cluster-name>` to list the brokers and see what addresses are advertised?
----
2020-02-14 16:53:51 UTC - Santiago Del Campo: The advertised address is the IP+port of the Pod.
----
2020-02-14 16:54:01 UTC - Mikhail Veygman: Hi.  Is it possible to publish to Pulsar Topic without creating a Pulsar Producer?
----
2020-02-14 16:54:43 UTC - Santiago Del Campo: It's weird... the cluster is working, so naturally all the services find each other through service names within the pulsar1 namespace.
----
2020-02-14 16:55:57 UTC - Santiago Del Campo: All of them except pulsar-dashboard, which somehow insists on going to the default namespace... or at least that's what I understand :thinking_face:
----
2020-02-14 16:57:23 UTC - David Kjerrumgaard: @Mikhail Veygman You can use the Pulsar CLI tools to publish data directly to a Pulsar topic without having to write any code.   <https://pulsar.apache.org/docs/en/reference-cli-tools/#produce>
----
2020-02-14 17:03:46 UTC - Addison Higham: hrm, so I am having trouble decommissioning bookie nodes. I tried to decommission one bookie, but the replication process just never finished, so I eventually killed the node and figured autorecovery would eventually recover the data. It is a few days later and I still have 3 ledgers reported as under-replicated
----
2020-02-14 17:05:38 UTC - Addison Higham: It seems likely that somehow those ledgers were only on a single bookie? I had some other weird operational things happen (disks got full) so that seems possible... but it appears the ctime of the ledgers is around the time I tried to perform the decommission...
----
2020-02-14 17:12:53 UTC - Mikhail Veygman: @David Kjerrumgaard That may work from a shell script, but is it possible from a program?
----
2020-02-14 17:13:01 UTC - Mikhail Veygman: Like Java or C++
----
2020-02-14 17:13:17 UTC - Addison Higham: why do you not want to create a producer?
+1 : David Kjerrumgaard
----
2020-02-14 17:14:00 UTC - David Kjerrumgaard: Agreed.  That's what they are for.....any other alternative is a hack
----
2020-02-14 17:31:15 UTC - Mikhail Veygman: @Addison Higham Because I may have to create too many producers based on the topics I have to publish to. Alternatively, if a single producer could publish to multiple topics, that would work as well
----
2020-02-14 17:31:50 UTC - Jordan Pilat: What is _too many producers_ ?
----
2020-02-14 17:32:40 UTC - Mikhail Veygman: 90k
----
2020-02-14 17:32:59 UTC - Jordan Pilat: Why?
----
2020-02-14 17:33:15 UTC - Mikhail Veygman: Why so many?
----
2020-02-14 17:33:20 UTC - Jordan Pilat: Right
----
2020-02-14 17:33:30 UTC - Aaron Zhuo: Can you guys talk in a thread?
+1 : Ruslan Sheremet, Sijie Guo
----
2020-02-14 17:33:36 UTC - Mikhail Veygman: BRB
wave : Jordan Pilat
----
2020-02-14 17:34:15 UTC - Jordan Pilat: How? :stuck_out_tongue:
----
2020-02-14 17:34:25 UTC - Addison Higham: okay, so one thing is producers are multiplexed across a single connection, so yes, you can usually have large numbers of producers. What has previously been suggested is to basically write a wrapper API that can publish to an arbitrary topic and under the hood to use a producer cache with size  or a ttl to close/age out some producers
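The wrapper-with-a-cache approach described here could be sketched roughly like this (a minimal LRU sketch, not a Pulsar API: `create_producer` is any callable returning an object with `close()`, e.g. a real client's producer factory, and the size is made up):

```python
from collections import OrderedDict

class ProducerCache:
    """LRU cache of producers keyed by topic; evicted producers are closed."""

    def __init__(self, create_producer, max_size=1000):
        self._create = create_producer   # e.g. client.create_producer (hypothetical wiring)
        self._max = max_size
        self._cache = OrderedDict()

    def get(self, topic):
        if topic in self._cache:
            self._cache.move_to_end(topic)   # mark as most recently used
            return self._cache[topic]
        producer = self._create(topic)
        self._cache[topic] = producer
        if len(self._cache) > self._max:
            _, oldest = self._cache.popitem(last=False)
            oldest.close()                   # age out the least recently used producer
        return producer
```

A publish call then becomes `cache.get(topic).send(payload)`, and the cache bounds how many producers are live at once; a TTL variant would store a timestamp next to each producer and close entries older than the cutoff on access.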
----
2020-02-14 17:34:47 UTC - Jordan Pilat: I'm just kidding -- is there maybe a design support channel more geared towards this?
----
2020-02-14 17:38:41 UTC - Jordan Pilat: @Aaron Zhuo?
----
2020-02-14 17:39:22 UTC - Aaron Zhuo: Wut?
----
2020-02-14 17:39:50 UTC - Jordan Pilat: Is there a channel better suited to design support, where threads wouldn't be necessary?  Since this is a general channel
----
2020-02-14 17:42:03 UTC - Aaron Zhuo: What I mean is having a conversation within a thread makes it easy to gather all the related information. People who are interested in this topic can keep talking in this thread while people who are not interested in this topic can post other questions in the channel and start other threads.
----
2020-02-14 17:42:47 UTC - Jordan Pilat: Yeah -- I like normal channel chat, because it allows more options for communicating, is easier to search through, etc, but I don't want to clog up this channel -- what would be a better channel that doesn't require threads for each conversation?
----
2020-02-14 17:43:11 UTC - Aaron Zhuo: IDK
----
2020-02-14 17:43:38 UTC - Jordan Pilat: Alright, I'll avoid bothering you any further :wink:
----
2020-02-14 17:44:55 UTC - Aaron Zhuo: I think a thread is a better way to avoid distracting other people in the same channel.
----
2020-02-14 17:54:14 UTC - tcourdy: i think the answer to this is yes based on looking at the source code, but I want to ask to be 100% sure. It appears that functions can have multiple input topics but only one output topic? Is this correct? Similarly, it seems that sources can only have one output topic, correct?
----
2020-02-14 17:54:55 UTC - Addison Higham: functions can have multiple outputs as well if you use the context API
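In the Python functions API the context call is `context.publish(topic, message)`; a duck-typed sketch of that routing pattern (topic names invented for the example, and a real function would be handed a real Context by the runtime):

```python
class ContentRouter:
    """Sketch of a Pulsar function that fans events out to several
    output topics via the context API rather than the single
    configured output topic."""

    ROUTES = {
        "error": "persistent://public/default/app-errors",
        "metric": "persistent://public/default/app-metrics",
    }
    DEFAULT = "persistent://public/default/app-other"

    def process(self, input, context):
        # pick a destination based on the event itself
        topic = self.ROUTES.get(input.get("kind"), self.DEFAULT)
        context.publish(topic, input)   # one function, many output topics
        return None                     # nothing sent to the function's own output topic
```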
----
2020-02-14 17:57:07 UTC - Jordan Pilat: Hopefully it catches on, then!
----
2020-02-14 18:00:15 UTC - Sijie Guo: you can use `bookkeeper shell` to troubleshoot.

1. check which ledgers are under-replicated.
2. check the ledger metadata to see what the ensemble settings are, and potentially what topic the ledger belongs to.
This will help you understand a bit of the problem.
----
2020-02-14 18:01:25 UTC - Addison Higham: okay, yeah, so I actually still had the disk around, so I brought the node back up and it went back to no under-replicated ledgers; tried to kill it again but am back to the same 3 ledgers. How do I check ledger metadata?
----
2020-02-14 18:02:21 UTC - Addison Higham: oh NM I see it
----
2020-02-14 18:03:29 UTC - Addison Higham: oh also it wasn't the same 3 ledgers, but 3 new ledgers that all have a timestamp close to today
----
2020-02-14 18:04:24 UTC - Addison Higham: huh....
```LedgerMetadata{formatVersion=2, ensembleSize=1, writeQuorumSize=1, ackQuorumSize=1, state=CLOSED, length=175758565, lastEntryId=672, digestType=CRC32, password=base64:, ensembles={0=[pulsar-beta-bookkeeper-5.pulsar-beta-bookkeeper.pulsar.svc.cluster.local:3181]}, customMetadata={}}```
----
2020-02-14 18:04:35 UTC - Addison Higham: that would explain it... but no idea where it is coming from
----
2020-02-14 18:04:45 UTC - David Kjerrumgaard: @tcourdy Addison is correct.  Just to add some context, you can reference the following blog for a more detailed example of how to achieve multiple output topics.  <https://streaml.io/blog/eda-event-processing-design-patterns-with-pulsar-functions>
----
2020-02-14 18:06:36 UTC - Sijie Guo: Most of the Pulsar clients are pub/sub clients that implement the core Pulsar pub/sub messaging protocol.

For Presto, I think it exposes HTTP services. You can issue SQL queries over the HTTP server. That will let you interact with Presto programmatically.

Regarding setting up Presto in Kubernetes, currently it is not provided in the helm or k8s deployment scripts. Feel free to create a GitHub issue on the pulsar repo.

Since Pulsar SQL is basically just Presto, you might be able to refer to some Presto documentation about installation. For example: <https://docs.starburstdata.com/latest/kubernetes/overview.html>

Hope this helps!
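As a sketch of what "SQL over HTTP" looks like: Presto's REST API takes the SQL text as the body of a POST to `/v1/statement`, with the user in a header. The host, port, and user below are placeholders (Pulsar SQL's worker typically listens on 8081, but check your Presto config); the helper only builds the request, it does not send it:

```python
from urllib import request

def presto_statement_request(host: str, sql: str, user: str = "pulsar",
                             port: int = 8081) -> request.Request:
    """Build (but do not send) the POST that Presto's REST API expects."""
    return request.Request(
        f"http://{host}:{port}/v1/statement",
        data=sql.encode("utf-8"),            # the SQL text is the request body
        headers={"X-Presto-User": user},     # Presto requires a user header
        method="POST",
    )
```

Sending it with `urllib.request.urlopen` returns a JSON document whose `nextUri` field you poll until the query finishes and results arrive in pages.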
----
2020-02-14 18:09:35 UTC - Sijie Guo: are you running sanitycheck?
----
2020-02-14 18:10:08 UTC - Sijie Guo: or simpletest with ensemble size == 1
----
2020-02-14 18:10:50 UTC - Sijie Guo: I have been recommending that people avoid using ensemble size == 1 (even if it is just for a sanity check).
----
2020-02-14 18:14:22 UTC - Addison Higham: no, we aren't running it anywhere automated. it is possible that somewhere we created a NS policy with an ensemble size of 1, which certainly isn't intended! Just wish there was an easy way to tie it back to what was actually there
----
2020-02-14 18:15:58 UTC - tcourdy: okay, just so I'm clear: functions can have multiple output topics via `Context.getUserConfigValue()`, with values supplied via the cli during registration of the function. However, sources cannot have multiple output topics since they do not have a `getUserConfigValue()` on the `SourceContext`? Although I guess a workaround could be having the user specify multiple output topics in the config file they submit with the function via the cli?
----
2020-02-14 18:19:26 UTC - tcourdy: cool thank you for this article.
+1 : David Kjerrumgaard
----
2020-02-14 18:26:22 UTC - Sijie Guo: in newer versions of pulsar/bookkeeper, we actually write extra metadata in `customMetadata={}` to indicate which topic/partition a ledger belongs to.
----
2020-02-14 18:27:15 UTC - Sijie Guo: either you are still running an old version, or the ledger was created by some other library or tool. that was the reason I suspected the sanity check
----
2020-02-14 18:34:19 UTC - Addison Higham: this is 2.5.0...
----
2020-02-14 18:37:23 UTC - Lari Hotari: Looks a bit similar to <https://github.com/apache/pulsar/issues/6224>
----
2020-02-14 18:41:04 UTC - Mikhail Veygman: Sorry.  Just got back. :slightly_smiling_face:
----
2020-02-14 18:41:30 UTC - David Kjerrumgaard: @tcourdy You are correct. The `SourceContext` does not have the `publish` method, which is the key to this capability in functions. The best you can do for now is to have the Source connector publish to a single topic, and then have a Pulsar function similar to the one in the blog consume from that topic to perform the multi-topic routing.
----
2020-02-14 18:41:41 UTC - David Kjerrumgaard: A simple composite of the two should work
----
2020-02-14 18:47:15 UTC - Santiago Del Campo: Here's a screenshot of the logs:
----
2020-02-14 18:49:36 UTC - Santiago Del Campo: You can see the FQDN of a broker belonging to the default namespace (which does not exist in my cluster; there is only a broker within the namespace *pulsar1*)... but when I run the collector, I specifically pass the *pulsar1* namespace:
----
2020-02-14 18:49:51 UTC - tcourdy: ahhh i see thank you very much.
----
2020-02-14 18:51:37 UTC - Santiago Del Campo: Those two screenshots were taken inside the pulsar-dashboard pod.
----
2020-02-14 18:54:31 UTC - Addison Higham: @Sijie Guo do you by chance know how much data a sanity check would be? That ledger has ~175 MB of data. Afaik, we wouldn't be running it anywhere, we are basically using the k8s helm chart with some minimal changes, but I also don't see metadata that I would expect to be in pulsar. I am wondering what else would be using BK that could do this...
----
2020-02-14 18:55:18 UTC - Sijie Guo: Oh I see. Then it is not sanity check 
----
2020-02-14 18:55:43 UTC - Sijie Guo: Functions?
----
2020-02-14 18:56:23 UTC - Sijie Guo: Are you using functions? What is your replicas setting for functions?
----
2020-02-14 19:06:30 UTC - Addison Higham: do you mean for the functions namespace? or for function state?
----
2020-02-14 19:10:44 UTC - Aaron Zhuo: Thanks for the reply!
So you are saying that I should actually use presto-client to connect to a presto cluster and execute pulsar SQL using that client?

For setting up presto cluster in k8s, I found a helm for doing that. Maybe I can tweak it to make it able to connect to a pulsar cluster and contribute it back to pulsar repo.
----
2020-02-14 20:05:30 UTC - Sijie Guo: > use presto-client to connect to a presto cluster and execute pulsar SQL using that client?
yes

> For setting up presto cluster in k8s, I found a helm for doing that. Maybe I can tweak it to make it able to connect to a pulsar cluster and contribute it back to pulsar repo.
sounds good.
----
2020-02-14 20:05:47 UTC - Aaron Zhuo: Thanks!
----
2020-02-14 20:06:30 UTC - Addison Higham: okay, so I validated that none of our policies for any clusters have an ensemble size of 1, our defaults are set at 3/3/2, so are you referring to the `numFunctionPackageReplicas` property?
----
2020-02-14 20:07:41 UTC - Addison Higham: @Sijie Guo ^^ if so, we haven't overridden that, and it doesn't appear the `WorkerConfig` sets a default
----
2020-02-14 20:08:34 UTC - Addison Higham: oh NM, we do set it to 1, okay, yeah... so that is likely it, just FYI, yes, that ledger doesn't appear to attach metadata
----
2020-02-14 20:10:54 UTC - Sijie Guo: So it seems to be the function package replicas problem. You might consider increasing it to 2 or 3 :slightly_smiling_face:

> just FYI, yes, that ledger doesn’t appear to attach metadata
I think there is a change on the BK side to add metadata. We will be using that change when we bump the bk version in pulsar.
----
2020-02-14 20:13:43 UTC - Addison Higham: the default in the yaml appears to be 1, wonder if it should be set to 2?
----
2020-02-14 20:24:50 UTC - Addison Higham: @Sijie Guo if I update that param, should it update the ensemble size and then have bookkeeper add more replicas? not sure how that works for already existing ledgers
----
2020-02-14 20:25:10 UTC - Addison Higham: oh nm, I am guessing it will roll new ledgers won't it and close the old ones
----
2020-02-14 21:30:40 UTC - Yuvaraj Loganathan: level=error msg="Failed to create producer" error="server error: ServiceNotReady: Namespace is being unloaded, cannot add topic persistent://fooo/bar/xxx" We are continuously seeing this error in the client. What does this mean? We have enabled offloading to S3 on pulsar 2.4.2
----
2020-02-14 21:47:14 UTC - Roman Popenov: @David Kjerrumgaard @tcourdy How crazy is the idea to have a client inside the source code that can push to different topics depending on the data?
----
2020-02-14 21:48:16 UTC - Roman Popenov: If I am not mistaken, the sources and functions use the pulsar client anyway to push out to topics? Is there harm in using the client API inside the source?
----
2020-02-14 21:48:38 UTC - Roman Popenov: What would be the cons and pros?
----
2020-02-14 21:54:16 UTC - Addison Higham: @Sijie Guo sorry to keep pinging you... but I am seeing problems where I need to manually restart the autorecovery worker to trigger the audit. It appears that `Resetting LostBookieRecoveryDelay value: 0, to kickstart audit task` isn't actually doing that
----
2020-02-14 21:54:34 UTC - Addison Higham: or at least, I don't see any progress until I do restart the autorecovery worker
----
2020-02-14 22:07:25 UTC - David Kjerrumgaard: @Roman Popenov I think it would be a bit of overkill for this use case.
----
2020-02-14 22:09:49 UTC - Sijie Guo: > the default in the yaml appears to be 1, wonder if it should be set to 2?
the default functions_worker.yml is also used by standalone. I think we should set the default to 2 and fix the standalone starter to always set it to 1, because standalone only starts one bookie.
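For reference, the knob being discussed lives in the function worker config; a fragment like the following reflects the suggestion above (2 is the proposed value for a multi-bookie cluster, not necessarily the shipped default):

```yaml
# conf/functions_worker.yml (fragment)
# Number of BookKeeper copies kept for each uploaded function package.
# Must not exceed the number of bookies; standalone (one bookie) needs 1.
numFunctionPackageReplicas: 2
```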
----
2020-02-14 22:10:38 UTC - David Kjerrumgaard: One of the benefits of Connectors/Functions is the auto-management of the Pulsar client for you, so purposely trying to manage the client inside the code is counter-productive. Any use case that requires that you do so isn't a good candidate for a Connector or Function IMHO
----
2020-02-14 22:11:04 UTC - Sijie Guo: > I am seeing problems where I need to manually restart the autorecovery worker to trigger the audit. It appears that `R`
interesting.. sounds like a bug. maybe file a ticket to the bookkeeper repo with some autorecovery logs.
----
2020-02-14 22:11:46 UTC - Sijie Guo: sounds like a namespace bundle is being unloaded for load balancing purposes.
----
2020-02-14 22:17:04 UTC - Roman Popenov: Is there a reason why the source didn’t get an implementation of `newOutputMessage` ?
----
2020-02-14 22:25:43 UTC - Marcus Elyiace: I'm noticing that when I attempt to delete a tenant that contains a namespace using the Pulsar REST API, I get this response: {"reason":"The tenant still has active namespaces"}
Looking at the documentation for the tenant DELETE REST call, it says "Delete a tenant and all namespaces and topics under it." How can I take advantage of easily deleting a tenant/namespaces/topics?
----
2020-02-14 22:31:19 UTC - David Kjerrumgaard: @Roman Popenov The Source connector uses the `read` method to produce and publish messages to the output topic. This allows the internal framework to call the same method for ALL sources. Introducing alternative methods for publishing messages would break that contract and make debugging more difficult.
----
2020-02-14 22:33:34 UTC - David Kjerrumgaard: @Marcus Elyiace The error message indicates that there are ACTIVE namespaces, meaning that you have topics inside the namespace with active producers or consumers. Therefore, the command does NOT allow you to delete the tenant (as a safety precaution) unless there are no active clients.
----
2020-02-14 22:35:56 UTC - Roman Popenov: @David Kjerrumgaard So one read is associated with one source, and this was a design choice.
----
2020-02-14 22:36:24 UTC - David Kjerrumgaard: Yes, see <https://pulsar.apache.org/docs/en/io-develop/#source>
----
2020-02-14 22:37:17 UTC - Roman Popenov: Because I was entertaining a funky idea where a source would have X threads reading X topics in Kafka, let’s say
----
2020-02-14 22:37:22 UTC - David Kjerrumgaard: there are 2 methods you need to implement, that's it. The goal was to simplify development.
----
2020-02-14 22:38:06 UTC - Roman Popenov: So to not have multiple or too many of the same sources and group everything into one
----
2020-02-14 22:38:12 UTC - David Kjerrumgaard: This is the current Kafka Source..... <https://github.com/apache/pulsar/blob/master/pulsar-io/kafka/src/main/java/org/apache/pulsar/io/kafka/KafkaAbstractSource.java>
----
2020-02-14 22:38:55 UTC - Roman Popenov: So if I understand correctly, to read from 3 Kafka topics, I would need to submit three different sources?
----
2020-02-14 22:39:26 UTC - David Kjerrumgaard: For Kafka, you can have one KafkaSource connector per Kafka consumer group id (which is a limitation in Kafka)
----
2020-02-14 22:40:14 UTC - David Kjerrumgaard: Yes, 3 separate sources, each configured with a different topic
----
2020-02-14 22:41:24 UTC - David Kjerrumgaard: You could (in theory) extend the above code to support multiple topics, and have multiple threads (1 per topic consumer)
----
2020-02-14 22:42:00 UTC - David Kjerrumgaard: but you would have to handle scheduling of the threads, maintaining ordering of the Kafka topic data, etc.
----
2020-02-14 22:43:06 UTC - Roman Popenov: > You could (in theory) extend the above code to support multiple topics, and have multiple threads (1 per topic consumer)
This was the idea I had in mind, but each topic that is read from needs to be pushed into a separate pulsar topic, because the idea was to group similar functionality together. When sources and functions are submitted as pods, each source is a pod. Is there a way to group those sources on one single k8s pod? I see that each source spawns a pod.
----
2020-02-14 22:43:35 UTC - David Kjerrumgaard: Also dealing with different schema versions on the Kafka topics and mapping them all to a single schema on the Pulsar side.  :smiley:
----
2020-02-14 22:44:05 UTC - Roman Popenov: Yeah, if there was three output Pulsar topics, that would mean three different independent schemas
----
2020-02-14 22:44:32 UTC - Roman Popenov: But passing through a single topic, it has to be the same type from all three Kafka topics, which isn't a big deal, but
----
2020-02-14 22:44:36 UTC - David Kjerrumgaard: So the configuration starts becoming problematic at that point...
----
2020-02-14 22:46:44 UTC - David Kjerrumgaard: "Is there a way to group those sources on one single k8s pod?"  Not without modifying the scheduler or tweaking some pod affinity settings.  The bigger question is why would you care where the pods were running? That kind of defeats the purpose of the pods, as they can be scheduled / restarted anywhere based on available resources.
----
2020-02-14 22:51:41 UTC - Roman Popenov: Because if there are let’s say 10 sources from kafka
----
2020-02-14 22:51:44 UTC - Roman Popenov: that’s 10 pods
----
2020-02-14 22:52:18 UTC - Roman Popenov: You can submit them as localrun, but then brokers will have to manage them, and isolation is preferred
----
2020-02-14 22:52:44 UTC - Roman Popenov: The question is more about scaling, as there is a limited number of IPs too
----
2020-02-14 22:53:13 UTC - Roman Popenov: So it would be good to have a connector that could read multiple topics from Kafka and push data to multiple pulsar topics
----
2020-02-14 23:07:50 UTC - David Kjerrumgaard: Those are all good points. But having one source that can read from 10 topics (each with a different schema, each with a different RBAC policy) and write to multiple Pulsar topics (each with a different schema, each with a different RBAC policy) will be a very complex piece of code that is hard to configure and debug.  It is also a single point of failure if there is an issue with one or more of the topics on either end
----
2020-02-14 23:08:10 UTC - David Kjerrumgaard: Again, it is something you can definitely write. But it does have some trade-offs. :smiley:
thinking_face : Roman Popenov
100 : Roman Popenov
----
2020-02-14 23:08:39 UTC - Roman Popenov: Perhaps in Go then :joy:
----
2020-02-14 23:11:44 UTC - Marcus Elyiace: @David Kjerrumgaard I have this issue after simply creating the namespace. I have no active topics within the namespace. As soon as I make it, I cannot delete the tenant.
----
2020-02-14 23:12:57 UTC - David Kjerrumgaard: @Marcus Elyiace That sounds like a bug. Can you please file an issue on GitHub, including steps to reproduce?
----
2020-02-14 23:13:53 UTC - Marcus Elyiace: @David Kjerrumgaard Will do, thanks!
+1 : David Kjerrumgaard
----
2020-02-14 23:14:36 UTC - Roman Popenov: @Marcus Elyiace Can you also check to see if there are topics in the namespace?
----
2020-02-14 23:15:57 UTC - Marcus Elyiace: I am no longer in the office, but I'll make sure to do that. I mainly wanted to verify if this was even possible. I was working on a clean install at the time
+1 : Roman Popenov
----
2020-02-15 02:37:30 UTC - Yuvaraj Loganathan: Thanks sijieg!
----