Posted by Apache Pulsar Slack on 2020/03/26 09:11:03 UTC

Slack digest for #general - 2020-03-26

2020-03-25 09:39:40 UTC - Pierre-Yves Lebecq: When I have a producer with an Avro schema creating a topic, I can see the topic and its schema using the CLI. When I add a function on the topic, I have the following error: `java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to com.zenaton.engine.workflowengine.messages.DispatchTaskMessage`

Whereas if I do the exact same thing but use JSONSchema.of() instead of AvroSchema.of() for the producer when it creates the topic, everything works fine when I use a function on this topic.
2020-03-25 09:42:37 UTC - Yosi Attias: @Gilles Barbier thanks!

I found a solution to my parallelism issue. Instead of mapping a key (event) to a list of subscriptions, which can cause issues,
I will map a key like `Event/SubscriptionId` to its subscription. This way I won’t have any concurrency issue, and I can use the failover subscription mode to handle the issue.
(The subscriptions I mention above are my application's subscriptions, not pulsar subscriptions.)

The only thing missing right now is adding a range read so I can read all subscriptions for a given event.
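A rough sketch of the key layout and the range read described above, assuming a store that keeps keys sorted. Here a `TreeMap` stands in for the real store, and the class name, method names, and sample data are all hypothetical. It also assumes event names never contain `/`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch: a sorted in-memory map stands in for the real key-value store.
class SubscriptionIndex {
    // Keys have the form "<event>/<subscriptionId>", so one event's keys sort next to each other.
    private final TreeMap<String, String> store = new TreeMap<>();

    static String key(String event, String subscriptionId) {
        return event + "/" + subscriptionId;
    }

    // Each subscription is written under its own key, so writers never update a shared list.
    void put(String event, String subscriptionId, String subscription) {
        store.put(key(event, subscriptionId), subscription);
    }

    // Range read over ["<event>/", "<event>0"): '0' is the character right after '/' in ASCII,
    // so this half-open range covers exactly the keys with the "<event>/" prefix.
    List<String> subscriptionsFor(String event) {
        SortedMap<String, String> range = store.subMap(event + "/", event + "0");
        return new ArrayList<>(range.values());
    }

    public static void main(String[] args) {
        SubscriptionIndex index = new SubscriptionIndex();
        index.put("order-created", "sub-a", "billing");
        index.put("order-created", "sub-b", "shipping");
        index.put("order-cancelled", "sub-c", "billing");
        System.out.println(index.subscriptionsFor("order-created"));
    }
}
```

Whether the actual store offers an ordered scan like `subMap` is exactly the missing piece mentioned above, so this only shows the shape of the lookup.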
2020-03-25 09:50:11 UTC - xue: Does pulsar function support the spring framework?
2020-03-25 09:51:59 UTC - xue: ```ApplicationContext applicationContext = new ClassPathXmlApplicationContext("file:G:\\dubbo-consumer.xml");```
The function shows this error: cvc-elt.1: Cannot find the declaration of element 'beans'
2020-03-25 10:00:49 UTC - Dennis Yung: I found that the WebSocket client doesn't support schemas right now, and such support doesn't seem to be coming soon. Is it possible to hack around it? E.g., inject schema version info into the message manually, and publish the schema through the REST API?
2020-03-25 14:54:28 UTC - David Kjerrumgaard: @xue I think Spring would be a bit of overkill for a Pulsar Function IMHO. Longer term, the community needs to come up with a project similar to spring-kafka. Pulsar functions are intended to be lightweight, simple functions that are only a few lines long.
2020-03-25 14:59:39 UTC - David Kjerrumgaard: @Pierre-Yves Lebecq If you are using the LocalRunner, then you can configure the input schema type as shown in this snippet.
2020-03-25 15:01:42 UTC - Pierre-Yves Lebecq: @David Kjerrumgaard I’m not unfortunately. Is there a way to do the same thing when using the CLI “functions create” command?
2020-03-25 15:04:44 UTC - David Kjerrumgaard: Yes, there is a `--custom-schema-inputs` switch you can use to specify this type of information.
2020-03-25 15:06:25 UTC - Pierre-Yves Lebecq: I’ll have a look, thank you. :+1:
2020-03-25 16:26:21 UTC - Sijie Guo: @Pierre-Yves Lebecq I think there is an issue regarding avro. We are fixing that.
2020-03-25 16:30:16 UTC - Pierre-Yves Lebecq: @Sijie Guo All right. thanks for the heads up.
2020-03-25 17:19:42 UTC - Ashish Shinde: @Ashish Shinde has joined the channel
2020-03-25 17:25:08 UTC - Joel Cressy: Hey, I just found this comment thread on HN and I’m extremely intrigued.

If anyone knows the OP or can get in contact with them, can we share code examples? My exact stack is AWS, EKS and hashicorp vault, so i’d love to know how they glued together auth (piggybacked on aws iam) and various other configs.
2020-03-25 17:42:35 UTC - Adam Feldman: Email is in their HN profile.
2020-03-25 17:48:56 UTC - Addison Higham: hi, that is me :slightly_smiling_face:
tada : Adam Feldman, Sijie Guo
2020-03-25 17:49:07 UTC - Addison Higham: @Joel Cressy ^^
2020-03-25 17:50:12 UTC - Joel Cressy: Awesome! glad to meet you
2020-03-25 17:51:16 UTC - Joel Cressy: i’m just beginning my research into this and am trying to gather as much info as I can
2020-03-25 17:51:29 UTC - Addison Higham: sure thing, let me put some details under a thread here
100 : Sijie Guo
+1 : Sijie Guo
2020-03-25 17:57:19 UTC - Addison Higham: So, the quick 5 minute version:
• we run on EKS. We started with the manifest definitions in the pulsar repo and have changed them quite a bit since, but there is nothing too fancy there and no real magic in an individual pulsar cluster
• We have many AWS accounts and VPCs at my company, so we expose pulsar via the pulsar proxy to other accounts using privatelink endpoints. We create the NLB for the pulsar proxy using just a k8s service. Then we have some terraform that simply looks up the service in k8s and creates the privatelink endpoint.
• We use a combination of VPC peering, external DNS, and NLBs to run a global zookeeper on K8S AND to allow for geo replication. I can get into more detail on this later
2020-03-25 18:00:59 UTC - Addison Higham: now for auth: the basic concept is that we allow our users (via a small CLI tool and a microservice) to create an association between an IAM role and a pulsar role.

This association is basically stored in vault. Vault allows you to create policies that are "templated" based on the identity you connect with. So for example, the following vault policy:
```pulsar/data/client/{{identity.entity.aliases.<id of your aws iam service auth>.name}}/creds/*```
2020-03-25 18:01:16 UTC - Addison Higham: we apply that policy to any roles we want to trust, in any AWS account
2020-03-25 18:02:44 UTC - Addison Higham: basically, AWS roles that auth against vault can read from a location that includes a unique identifier of their role. Our little microservice periodically drops creds to that same location (based on the association a user created)
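To make the path layout concrete, here is a small sketch of how the templated location resolves per caller. It does not talk to Vault. The class, method names, and the role IDs are made up for illustration, and the prefix check only mimics the trailing `*` of the policy; it is not a security control.

```java
import java.util.Objects;

// Hypothetical sketch: models the path layout implied by the templated policy.
// It never calls Vault; every name and ID here is illustrative.
class VaultCredPaths {
    // Vault fills the template with the caller's alias name; per the thread,
    // that value is a unique identifier of the AWS role.
    static String credsPrefix(String awsUniqueId) {
        Objects.requireNonNull(awsUniqueId, "awsUniqueId");
        if (awsUniqueId.isEmpty() || awsUniqueId.contains("/")) {
            throw new IllegalArgumentException("unique ID must be a single path segment");
        }
        return "pulsar/data/client/" + awsUniqueId + "/creds/";
    }

    // Mimics the trailing "*": a caller may read only below its own prefix.
    static boolean mayRead(String callerUniqueId, String path) {
        return path.startsWith(credsPrefix(callerUniqueId));
    }

    public static void main(String[] args) {
        String roleA = "AROAEXAMPLEAAAAAAAAAA"; // made-up ID
        String roleB = "AROAEXAMPLEBBBBBBBBBB"; // made-up ID
        String tokenPath = credsPrefix(roleA) + "token";
        System.out.println(mayRead(roleA, tokenPath)); // role A reads its own creds
        System.out.println(mayRead(roleB, tokenPath)); // role B is outside that prefix
    }
}
```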
2020-03-25 18:03:05 UTC - Joel Cressy: Are these creds a jwt that pulsar validates?
2020-03-25 18:04:33 UTC - Addison Higham: correct, we just use token based auth, but we use public/private key tokens so that the brokers/proxy only need the public key and the private key can be limited to just signing tokens. We have a k8s cron job that refreshes the tokens every few hours. Eventually, we want to write a vault plugin so that tokens are generated on demand, but this works for now
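The public/private split can be sketched with the JDK alone: the signer holds the private key, and the verifying side needs only the public key. This is a minimal illustration, not the setup from the thread. The class and method names are hypothetical, the `sub` and `exp` claims are assumptions, the subject is assumed to need no JSON escaping, and `verify` checks the signature only, not expiry. A real deployment would use Pulsar's token tooling or a JWT library instead of hand-built encoding.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

// Hypothetical sketch of an RS256-style signed token using only the JDK.
class AsymmetricTokenSketch {
    private static final Base64.Encoder B64 = Base64.getUrlEncoder().withoutPadding();

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Signer side (e.g. the periodic refresh job): the only place the private key is needed.
    static String sign(String subject, long expiresAtEpochSeconds, PrivateKey privateKey) {
        String header = "{\"alg\":\"RS256\",\"typ\":\"JWT\"}";
        String payload = "{\"sub\":\"" + subject + "\",\"exp\":" + expiresAtEpochSeconds + "}";
        String signingInput = B64.encodeToString(header.getBytes(StandardCharsets.UTF_8))
                + "." + B64.encodeToString(payload.getBytes(StandardCharsets.UTF_8));
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(privateKey);
            signer.update(signingInput.getBytes(StandardCharsets.US_ASCII));
            return signingInput + "." + B64.encodeToString(signer.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Verifier side (broker/proxy role): checks the signature with the public key only.
    static boolean verify(String token, PublicKey publicKey) {
        String[] parts = token.split("\\.");
        if (parts.length != 3) {
            return false;
        }
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(publicKey);
            verifier.update((parts[0] + "." + parts[1]).getBytes(StandardCharsets.US_ASCII));
            return verifier.verify(Base64.getUrlDecoder().decode(parts[2]));
        } catch (GeneralSecurityException | IllegalArgumentException e) {
            return false; // malformed signature or encoding
        }
    }

    public static void main(String[] args) {
        KeyPair keys = newKeyPair();
        String token = sign("my-pulsar-role", 1900000000L, keys.getPrivate());
        System.out.println(verify(token, keys.getPublic()));
    }
}
```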
2020-03-25 18:05:47 UTC - Joel Cressy: Yeah, kinda sucks that vault doesn’t have a jwt secrets engine. You can auth to vault with jwt, but vault won’t generate/sign jwt’s for you.
2020-03-25 18:06:15 UTC - Addison Higham: we have a WIP plugin that we hope to open source when we get there, just sorta stalled out with lots of other priorities
2020-03-25 18:07:31 UTC - Addison Higham: but anyways, does that make sense? We have some tooling in our little CLI tool that can do the vault auth and fetch the token for you. That has made it easier for even teams that aren't totally up on vault yet to just get started by calling our CLI tool to fetch the token
2020-03-25 18:07:58 UTC - Joel Cressy: What kind of TTL is on these credentials?
2020-03-25 18:08:55 UTC - Addison Higham: oh and one important detail: the ID of the AWS role (or any AWS principal really) that vault uses for templating locations is the `UniqueID` which means that any tooling first needs to resolve role/user/etc to the unique ID before going to fetch the credential
2020-03-25 18:09:19 UTC - Addison Higham: (see this doc for details on that: <>)
2020-03-25 18:09:58 UTC - Joel Cressy: Oh, that’s like access key id for standard iam creds. When you assume role, you have one of these unique ID’s as the access key id along with secret and session/security token.
2020-03-25 18:10:20 UTC - Addison Higham: we let the user decide the TTL within a range, but right now it defaults to a year. We still have a lot of apps that aren't native k8s or don't have an existing vault integration yet. As that evolves, we want to drop the default TTL down to a day or something
2020-03-25 18:10:48 UTC - Joel Cressy: Awesome, yeah user configurable TTL is great because I also have a lot of things that don’t have native integrations.
2020-03-25 18:11:46 UTC - Joel Cressy: BTW, one of the first use cases I intend to run pulsar for is log ingest/forwarding. I will do traditional pub/sub later but current goals are to provide interfaces for log forwarders to send data (e.g. filebeat, fluentd/fluentbit, etc)
2020-03-25 18:11:49 UTC - Addison Higham: yeah, obviously not ideal, but since this is all over internal VPC (privatelink)  and we have a plan for apps to be able to use short lived tokens, that is where we are for now
2020-03-25 18:12:10 UTC - Addison Higham: heh, yeah, so I actually just wrote a fluentbit plugin
2020-03-25 18:12:36 UTC - Addison Higham: via the golang interface, haven't rolled it out at scale yet (next week likely), it isn't open source yet though
2020-03-25 18:12:53 UTC - Joel Cressy: I may also need to find a way to map non-aws service identities to pulsar, but with vault the sky’s the limit.
2020-03-25 18:14:21 UTC - Addison Higham: yeah, so our little microservice can generate creds for users after an okta login. But yes, the same pattern with vault should work relatively sanely
2020-03-25 18:15:19 UTC - Joel Cressy: ooo, yeah we use okta too so i’d have to do something similar. Users auth to vault with okta, so maybe something could be done there.
2020-03-25 18:16:58 UTC - Arthur: @Arthur has joined the channel
2020-03-25 18:21:51 UTC - Arthur: Hey, I'm trying pulsar on Kubernetes with the official helm chart and only 3 kubernetes nodes. On the first try, I got an affinity/antiaffinity error. I changed replication to 1 for zookeeper, bookie and broker, but now I get "ManagedLedgerException: Not enough non-faulty bookies available". Is it possible to disable antiaffinity?
2020-03-25 21:01:15 UTC - Sijie Guo: you need to change the broker config map to reduce ensemble size to 1.

I don’t think the current helm chart in master supports disabling antiaffinity.

If you are looking for a chart doing so, you can try our chart.
use this values file (<>)
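As a hedged illustration of the ensemble-size change: in `broker.conf` terms, the relevant settings are the three below. The write quorum and ack quorum cannot exceed the ensemble size, so on a single-bookie setup all three drop to 1 together. How these keys are written in a given chart's broker config map depends on the chart, so treat the exact placement as an assumption.

```
managedLedgerDefaultEnsembleSize=1
managedLedgerDefaultWriteQuorum=1
managedLedgerDefaultAckQuorum=1
```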
2020-03-26 00:24:34 UTC - Hiroyuki Yamada: Hi, I posted a question regarding Key_Shared behavior on the mailing list. Maybe my understanding and expectation are not correct, but it doesn't work as expected so far.
Can anyone help me?
(BTW, I've tested it with 2.5.0)
2020-03-26 00:39:03 UTC - Sijie Guo: Just replied.
2020-03-26 01:27:25 UTC - Kannan: Hi, I'm running Pulsar 2.5.0 with istio 1.5 on AKS (k8s cluster). Is it possible to replace pulsar proxy with istio ingress?
2020-03-26 02:00:29 UTC - Hiroyuki Yamada: Thank you !
I just replied back.
Hmm, seems like it is not working as expected. So, messages with the same key go to different consumers even if there is no new consumer joining the subscription.
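The expectation being tested here can be stated as a small invariant: with a fixed set of consumers, the same key always maps to the same consumer. The sketch below shows that invariant with a plain hash-modulo mapping. It is a simplification for illustration only and is not Pulsar's actual Key_Shared dispatch logic. The class name, method name, and consumer names are hypothetical.

```java
import java.util.List;

// Hypothetical sketch of the expected invariant, not Pulsar's dispatcher.
class KeyAffinitySketch {
    // Deterministic mapping: for an unchanged consumer list, a key always lands on one consumer.
    static String consumerFor(String key, List<String> consumers) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("no consumers");
        }
        // floorMod keeps the index non-negative even when hashCode() is negative.
        return consumers.get(Math.floorMod(key.hashCode(), consumers.size()));
    }

    public static void main(String[] args) {
        List<String> consumers = List.of("consumer-1", "consumer-2", "consumer-3");
        String first = consumerFor("user-42", consumers);
        String second = consumerFor("user-42", consumers);
        System.out.println(first.equals(second)); // same key, same consumer
    }
}
```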
2020-03-26 02:15:03 UTC - Andy Papia: we are using Keycloak for authn/authz in our system. has anyone thought about creating an authentication plugin for keycloak or something standards-based like OAuth 2?
2020-03-26 02:18:02 UTC - Ali Ahmed: @Andy Papia no, not yet, but it should be simple to do. Here is a sample external auth plugin for pulsar:
2020-03-26 02:18:03 UTC - Ali Ahmed: <>
2020-03-26 02:19:23 UTC - Andy Papia: thanks, that's helpful
2020-03-26 03:22:41 UTC - Kevin Hui: @Kevin Hui has joined the channel
2020-03-26 03:43:56 UTC - Sijie Guo: Interesting, @Penghui Li can you help check this?
man-bowing : Hiroyuki Yamada
2020-03-26 03:55:30 UTC - Evan Xu: @Evan Xu has joined the channel
2020-03-26 06:25:25 UTC - Amit Aggarwal: @Amit Aggarwal has joined the channel
2020-03-26 08:21:32 UTC - Amit Aggarwal: are there any known issues with the ttlDurationDefaultInSeconds config setting in pulsar 2.5.0?
After setting it to 604800 (7 days), the messages were being expired immediately (5 min after producing)
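A quick arithmetic check of the two figures above, assuming the setting is read in seconds as its name suggests: 7 days is 604800 seconds, while the observed 5 minutes is 300 seconds, so the configured value itself matches the intended 7 days. The class and method names are hypothetical.

```java
import java.time.Duration;

// Hypothetical helper: converts the intended retention window into seconds.
class TtlArithmetic {
    static long secondsInDays(long days) {
        return Duration.ofDays(days).getSeconds();
    }

    public static void main(String[] args) {
        System.out.println(secondsInDays(7));                   // 604800
        System.out.println(Duration.ofMinutes(5).getSeconds()); // 300
    }
}
```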
2020-03-26 08:52:10 UTC - Sijie Guo: Can you describe the sequence and how you verified that?