Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2018/10/06 09:11:03 UTC

Slack digest for #general - 2018-10-06

2018-10-05 11:23:47 UTC - dba: Hi all.
Some months ago I wrote and asked if there were any plans of creating a C#/.NET client. I just want to ask the same question and see if the answer is still "Sounds like a good idea but no" :slightly_smiling_face:
I have tried creating one myself, I can connect and send ping/pong, but I find the documentation on <http://pulsar.apache.org/docs/en/develop-binary-protocol/> a bit lacking in depth. Are there any other resources?
----
2018-10-05 13:06:06 UTC - Roopeshwar Devarajan: @Roopeshwar Devarajan has joined the channel
----
2018-10-05 13:26:48 UTC - Nicolas Ha: it worked btw :smile:
----
2018-10-05 15:02:23 UTC - Matteo Merli: @Nicolas Ha hostnames cannot contain `_` character 
+1 : Nicolas Ha
slightly_smiling_face : Nicolas Ha
----
2018-10-05 18:07:37 UTC - sfescape: Is there a way to override the retention policy and delete messages older than X? I want to set my retention policy to infinite and manage deletion manually.
----
2018-10-05 18:08:06 UTC - sfescape: I see there is a way to manually manage expiry, but not retention.
----
2018-10-05 18:16:17 UTC - Sijie Guo: @sfescape: <http://pulsar.apache.org/docs/en/cookbooks-retention-expiry/> is a good cookbook to check out for retention and expiry
----
2018-10-05 18:18:33 UTC - Sijie Guo: @dba:

> creating a C#/.NET client

I think most of the committers in the community currently don’t really have C# or .NET experience, so if anyone from the community can help, that would be great.

> <http://pulsar.apache.org/docs/en/develop-binary-protocol/> a bit lacking in depth

unfortunately it is the only one :disappointed: what more would you expect from it? we can help you
----
2018-10-05 18:19:02 UTC - Grant Wu: Do we need to manually delete old subscriptions?
----
2018-10-05 18:20:44 UTC - Sanjeev Kulkarni: @dba AFAIK, there are no concrete plans for a C#/.NET client yet. Of course the community can help build this up.
----
2018-10-05 18:29:01 UTC - sfescape: @Sijie Guo I've read that. And I've read the command "pulsar-admin topics expire-messages topic options", but there doesn't seem to be the equivalent for "delete-messages".
----
2018-10-05 18:33:46 UTC - Sijie Guo: @sfescape: sorry I misunderstood your question, you are asking for manual *truncation* on infinite streams? is that the case?
----
2018-10-05 18:36:29 UTC - Dave Southwell: I've been running some various benchmarks from here: <http://openmessaging.cloud/docs/benchmarks/> and no matter what I do, it seems that the bookkeeper ledger storage disks only fill up, and are never cleaned once the benchmarks are stopped.  For example I ran a benchmark yesterday around 5 PM and today I look at the disk usage info for the ledger storage device and it's at the same usage amount.  Is there something about these benchmarks that are leaving things in such a state that garbage collection isn't running?  Maybe non-closed subscriptions or something?  I've checked for backlogged messages and they are quite minimal.
----
2018-10-05 18:39:19 UTC - Sijie Guo: @Dave Southwell can you run topic stats or stats-internal to get the stats of topics? the data might not be deleted because of retention settings. getting those topic stats would help understand the situation.
----
2018-10-05 18:42:13 UTC - Dave Southwell: ```{
  "entriesAddedCounter" : 1544,
  "numberOfEntries" : 832,
  "totalSize" : 86879647,
  "currentLedgerEntries" : 832,
  "currentLedgerSize" : 86879647,
  "lastLedgerCreatedTimestamp" : "2018-10-05T02:08:02.51Z",
  "lastLedgerCreationFailureTimestamp" : "2018-10-05T02:07:51.956Z",
  "waitingCursorsCount" : 1,
  "pendingAddEntriesCount" : 0,
  "lastConfirmedEntry" : "463:831",
  "state" : "LedgerOpened",
  "ledgers" : [ {
    "ledgerId" : 463,
    "entries" : 0,
    "size" : 0,
    "offloaded" : false
  } ],
  "cursors" : {
    "sub-000" : {
      "markDeletePosition" : "463:828",
      "readPosition" : "463:832",
      "waitingReadOp" : false,
      "pendingReadOps" : 0,
      "messagesConsumedCounter" : 1541,
      "cursorLedger" : 566,
      "cursorLedgerLastEntry" : 127,
      "individuallyDeletedMessages" : "[]",
      "lastLedgerSwitchTimestamp" : "2018-10-05T02:05:46.362Z",
      "state" : "Open",
      "numberOfEntriesSinceFirstNotAckedMessage" : 4,
      "totalNonContiguousDeletedMessagesRange" : 0,
      "properties" : { }
    }
  }
}```
----
2018-10-05 18:43:13 UTC - Matteo Merli: The topic in question has 1 ledger of ~86MB and that data is not being deleted from BookKeeper
----
2018-10-05 18:43:43 UTC - Dave Southwell: Is the data not being deleted, because there's a subscriber?
----
2018-10-05 18:44:20 UTC - Dave Southwell: Under what conditions will data be deleted from bookkeeper, I guess is another way to ask the question.
----
2018-10-05 18:44:36 UTC - Matteo Merli: That and also that the topic will always have 1 ledger open. Since there were no more writes on the topic, the ledger rollover was not performed
----
2018-10-05 18:46:17 UTC - Matteo Merli: Also there are few unacked messages at the end of the topic. This would have happened when the benchmark terminated
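
For readers following along, the reading of the stats above can be reproduced with a small Python sketch. It is only an illustration: the helper names `parse_position` and `unacked_entries` are hypothetical, the JSON is a trimmed copy of the output Dave posted, and the calculation assumes both positions sit in the same ledger (as they do here, ledger 463).

```python
import json

def parse_position(pos):
    """Split a "ledgerId:entryId" string into a pair of integers."""
    ledger_id, entry_id = pos.split(":")
    return int(ledger_id), int(entry_id)

def unacked_entries(stats, subscription):
    """Count entries after the cursor's mark-delete position, up to the last
    confirmed entry. Simplification: only valid within a single ledger."""
    last_ledger, last_entry = parse_position(stats["lastConfirmedEntry"])
    cursor = stats["cursors"][subscription]
    md_ledger, md_entry = parse_position(cursor["markDeletePosition"])
    if last_ledger != md_ledger:
        raise ValueError("positions span several ledgers; per-ledger counts are needed")
    return last_entry - md_entry

# Trimmed copy of the stats-internal output posted in the thread.
stats = json.loads("""
{
  "entriesAddedCounter": 1544,
  "totalSize": 86879647,
  "lastConfirmedEntry": "463:831",
  "cursors": {
    "sub-000": {
      "markDeletePosition": "463:828",
      "messagesConsumedCounter": 1541
    }
  }
}
""")

print(unacked_entries(stats, "sub-000"))  # 3
```

The same figure falls out of the counters: 1544 entries added minus 1541 consumed leaves 3 entries that the subscription has not acknowledged.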
----
2018-10-05 18:47:30 UTC - Dave Southwell: Ok.  So, it seems running the benchmarks may leave some garbage that needs manual cleanup.  Aside from maybe scripting something to use pulsar-admin and/or the API, is there any way to remove topics in bulk?
----
2018-10-05 18:48:00 UTC - Dave Southwell: Or say, remove all topics from tenant/namespace/* ?
----
2018-10-05 18:50:27 UTC - Matteo Merli: there’s one with unsubscribe :

```
pulsar-admin namespaces unsubscribe tenant/namespace -s sub-000
```

This will drop that subscription on all the topics within a namespace. In turn, since these topics are not used anymore they will be deleted as well, deleting the BK ledgers and then freeing the space
+1 : Dave Southwell, Ali Ahmed, Nicolas Ha
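
A scripted version of this cleanup, along the lines Dave mentioned, might look like the Python sketch below. It is a hedged illustration, not a tested tool: the function names and the topic name `test-topic-0` are hypothetical, the `unsubscribe` command is the one quoted above, and `pulsar-admin topics delete` is assumed to behave as in the standard CLI (it is expected to refuse topics that still have subscriptions or producers). The topic list itself would normally come from `pulsar-admin topics list tenant/namespace`.

```python
import subprocess

def cleanup_commands(namespace, subscription, topics):
    """Build pulsar-admin argv lists: first drop one subscription across the
    whole namespace, then delete each listed topic explicitly."""
    cmds = [["pulsar-admin", "namespaces", "unsubscribe", namespace, "-s", subscription]]
    cmds += [["pulsar-admin", "topics", "delete", topic] for topic in topics]
    return cmds

def run_all(cmds, dry_run=True):
    """Print the commands by default; execute them only when dry_run is False."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)

# Hypothetical topic name, for illustration only.
commands = cleanup_commands(
    "tenant/namespace",
    "sub-000",
    ["persistent://tenant/namespace/test-topic-0"],
)
run_all(commands)  # dry run: only prints what would be executed
```

Keeping the command construction separate from execution makes it easy to review the list before anything is deleted.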
----
2018-10-05 19:58:15 UTC - sfescape: @Sijie Guo yes, along the lines of expire-messages I'd like to be able to specify deletion of messages older than X seconds.
----
2018-10-05 20:00:12 UTC - Matteo Merli: @sfescape So you want to set infinite retention on a namespace and then manually truncate the topic? It’s not currently possible but it should not be difficult to add.
----
2018-10-05 20:01:17 UTC - sfescape: exactly. If I had that command I could expire all messages older than X and then delete all messages older than X.
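
The half of this workflow that exists today can be sketched in Python as follows. This is an assumption-laden illustration: the function names and the topic and subscription names are hypothetical, the flags (`--size -1 --time -1` for infinite retention, `-s` and `-t` for `expire-messages`) reflect the CLI as described in the retention and expiry cookbook and should be checked against `pulsar-admin --help` for the version in use, and no truncation ("delete-messages") command is included because, per the thread, none exists.

```python
def infinite_retention_command(namespace):
    """argv for setting unlimited size and time retention on a namespace
    (assumes -1 means "infinite" for both limits)."""
    return ["pulsar-admin", "namespaces", "set-retention", namespace,
            "--size", "-1", "--time", "-1"]

def expire_command(topic, subscription, older_than_seconds):
    """argv for expiring messages older than the given age on one subscription.
    This skips the subscription past old messages; with infinite retention
    the data itself stays on disk."""
    if older_than_seconds < 0:
        raise ValueError("older_than_seconds must be non-negative")
    return ["pulsar-admin", "topics", "expire-messages", topic,
            "-s", subscription, "-t", str(older_than_seconds)]

# Hypothetical names, for illustration only.
print(" ".join(infinite_retention_command("tenant/namespace")))
print(" ".join(expire_command("persistent://tenant/namespace/my-topic", "my-sub", 3600)))
```

With infinite retention in place, expiry alone only affects what a subscription still has to consume, which is why a separate truncation command is being requested.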
----
2018-10-05 20:17:40 UTC - Sijie Guo: @sfescape: <https://github.com/apache/pulsar/issues/2736> created an issue for that. /cc @Matteo Merli
----
2018-10-05 22:09:32 UTC - William: Is anyone running a low-volume Pulsar instance? Say.... 200 messages a second at peak?
----
2018-10-05 22:09:54 UTC - William: I'm wondering if I can run this on a $5 Droplet.... :smiley:
----
2018-10-05 22:10:19 UTC - Grant Wu: I think you’ll probably be fine?
----
2018-10-05 22:10:36 UTC - William: I guess the worst that happens is that it becomes a $10 Droplet.
----
2018-10-05 22:11:07 UTC - William: I'm also wondering about authz/n best practices.
----
2018-10-05 22:11:18 UTC - William: Athenz server seems like a pain to set up.
----
2018-10-05 22:11:31 UTC - William: Rolling, per user TLS certs also seem painful to set up.
----
2018-10-05 22:11:54 UTC - William: Do you guys know of any script-kiddie dockerized solutions out there?
----
2018-10-05 22:11:56 UTC - Grant Wu: 1 GB RAM may be a little low
----
2018-10-05 22:12:05 UTC - Grant Wu: I don’t understand what Docker has to do with anything
----
2018-10-05 22:12:13 UTC - Grant Wu: There’s a Pulsar docker image, yes
----
2018-10-05 22:12:34 UTC - William: Because I can pull the image and hit Go... For the docker image I'm asking about Athenz or rolling TLS certs.
----
2018-10-05 22:12:52 UTC - Grant Wu: I think if you’re running a small-scale deployment, you should probably be fine running Pulsar unauthenticated, and relying on network isolation?
----
2018-10-05 22:13:07 UTC - Grant Wu: Do you want to have end users talk directly to Pulsar?
----
2018-10-05 22:13:13 UTC - William: The real question I'm asking (myself) here is if I want to roll my next prototype with Pulsar and an API gateway or ANOTHER Django project.
----
2018-10-05 22:13:25 UTC - William: Yes - location and clickstream direct to Pulsar.
----
2018-10-05 22:13:42 UTC - William: Otherwise I'm limited by my API gateway implementation at scales >200 msgs/second
----
2018-10-05 22:14:03 UTC - Grant Wu: Hrm.
----
2018-10-05 22:14:18 UTC - Grant Wu: You’ll have to ask the actual Pulsar devs about this, but I’m not sure how well this would work
----
2018-10-05 22:15:48 UTC - Grant Wu: It seems like produce/consume permissions are available on a per-namespace granularity
----
2018-10-05 22:15:59 UTC - Grant Wu: Don’t know if that’s fine enough for you
----
2018-10-05 22:18:16 UTC - Sijie Guo: @William - we actually ran pulsar on an edge box before (1GB memory, ~30GB disk, but it can be less). the current pulsar docker image is available under apachepulsar/pulsar. if you just need the basic pub/sub mechanism, it is possible to trim the size of the docker image down.
----
2018-10-05 22:19:23 UTC - William: :slightly_smiling_face: thanks! I created a lightning tracker with Pulsar and NASA sensor data last week on an authentication-required namespace. It ran great on my virtual machine with 1GB RAM and 2 CPUs
----
2018-10-05 22:19:45 UTC - William: But that was definitely less than 200/second
----
2018-10-05 22:20:11 UTC - Grant Wu: <https://github.com/yahoo/athenz> doesn’t seem to have a docker image, no
----
2018-10-05 22:20:15 UTC - William: My BIGGEST fear is that I'm going to get into authz/n hell - I'm definitely wondering the easiest way to go about that. Devops are not my forte.
----
2018-10-05 22:22:34 UTC - Grant Wu: I don’t think there’s an easy to roll out implementation of this, no
----
2018-10-05 22:24:20 UTC - Sijie Guo: I am not an expert of athenz, @Matteo Merli @Rajan Dhabalia they might have a better idea about the athenz :slightly_smiling_face:
----
2018-10-05 22:26:53 UTC - Matteo Merli: let’s just say my experiences with Athenz were that it’s a bit difficult to get started…
----
2018-10-05 22:28:54 UTC - Matteo Merli: @William there’s also the option of using basic-auth. It’s not well documented but it’s easy to set up (though only implemented in the Java client for now)
----
2018-10-05 22:29:22 UTC - Matteo Merli: (basic auth in that it’s just passing user/password and maintaining a passwd file on the brokers)
----
2018-10-05 22:29:26 UTC - William: Hah! Coming from you, that's daunting.
----
2018-10-05 22:29:54 UTC - William: For some reason I thought HTTP basic auth was unsupported.
----
2018-10-05 22:30:10 UTC - William: My brain was telling me TLS or Athenz and that's it.
----
2018-10-05 22:30:27 UTC - Grant Wu: Uh, I mean, it’s undocumented
----
2018-10-05 22:30:31 UTC - Matteo Merli: yes.. documentation is lacking there :slightly_smiling_face:
----
2018-10-05 22:30:33 UTC - Grant Wu: that’s probably why you thought that :stuck_out_tongue:
----
2018-10-05 22:30:48 UTC - Grant Wu: This is the first I’ve heard of it as well
----
2018-10-05 22:30:54 UTC - Grant Wu: Does this work with websockets?
----
2018-10-05 22:31:38 UTC - Grant Wu: Oh, no, you just said Java client only
----
2018-10-05 22:32:00 UTC - Grant Wu: I get the feeling that @William has Javascript clients in a web page in mind
----
2018-10-05 22:32:31 UTC - William: I don't necessarily.
----
2018-10-05 22:32:35 UTC - William: I don't even LIKE Javascript.
----
2018-10-05 22:32:49 UTC - William: But I do like mobile apps and streaming location/clickstream,
----
2018-10-05 22:33:14 UTC - William: The namespacing and geo-replication for me here is incredible.
----
2018-10-05 22:33:36 UTC - William: So I WANT to use it. I am just wary of getting mired down in these authentication issues.
----
2018-10-05 22:33:49 UTC - Matteo Merli: For basic auth: the best source is the PR itself at this point: <https://github.com/apache/pulsar/pull/1087>
----
2018-10-05 22:34:47 UTC - Grant Wu: Okay, I just wasn’t sure
----
2018-10-05 22:35:15 UTC - Grant Wu: I have no idea if the Java client can run in Android (unlikely, imo)
----
2018-10-05 22:35:51 UTC - Karthik Ramasamy: @William - FYI, we got Pulsar running in a Raspberry PI with 1 GB and 2 ARM cores with an SD card of 32 GB
----
2018-10-05 22:35:51 UTC - Matteo Merli: BTW: I believe the basic auth should also work in WebSocket proxy
----
2018-10-05 22:36:00 UTC - Karthik Ramasamy: it works very well
----
2018-10-05 22:36:09 UTC - William: @Karthik Ramasamy That's incredible!
----
2018-10-05 22:36:33 UTC - William: Thanks guys.
----
2018-10-05 22:36:46 UTC - William: It sounds like a "simple" implementation will require basic auth at first.
----
2018-10-05 22:40:07 UTC - Nicolas Ha: I am starting to play with the Pulsar helm chart on minikube.
I am not sure, past the experimentation phase, how I am supposed to manage persistent storage on a real cluster for pulsar. I did a bit of reading - it looks like I need dynamic storage allocation.
Do I have to setup gluster or ceph? Or anything else (whatever pulsar prefers)? Any guidelines there?
----
2018-10-05 22:46:35 UTC - Sijie Guo: @Nicolas Ha I think the current helm charts in the pulsar repo assume persistent volumes, which are easy to use on clouds, since clouds provide good support for persistent volumes.

if you have on-prem kubernetes, you have two options. one is to set up persistent volumes (e.g. ceph), however that would add to your ops load. the other one is to change the helm charts to deploy bookies as daemonsets, so that they use local disks and bookkeeper itself manages replication.  you can use this file <https://github.com/apache/pulsar/blob/master/deployment/kubernetes/generic/bookie.yaml> as an example of how to deploy bookies using a daemonset
----
2018-10-05 23:01:22 UTC - Nicolas Ha: Thanks for the reply, yes I am on-prem at the moment.
regarding ceph, I don’t get your comment ” that would introduce your ops load” ? But I am exactly after that: opinions of someone who has been there :slightly_smiling_face:
----
2018-10-05 23:01:41 UTC - Nicolas Ha: Or do you think daemonset is just simpler? (you mean editing this file to use Daemonset right? <https://github.com/apache/pulsar/blob/master/deployment/kubernetes/helm/pulsar/templates/bookkeeper-statefulset.yaml>)
----
2018-10-05 23:07:34 UTC - Sijie Guo: > ” that would introduce your ops load” ?

If you already have ceph, that is fine; if you don’t have ceph, then you have to set up ceph and manage it :slightly_smiling_face:

> do you think daemonset is just simpler?

if you don’t have ceph, then using daemon set is easier, because you don’t have another system to manage.

> you mean editing this file to use Daemonset right

yes.
slightly_smiling_face : Nicolas Ha
+1 : Nicolas Ha
----
2018-10-05 23:13:29 UTC - Grant Wu: “introduce your ops load” probably means “increase your ops load” or “introduce an extra ops burden”
+1 : Sijie Guo, Nicolas Ha
----
2018-10-05 23:14:13 UTC - Matteo Merli: and extra replication… BookKeeper is already capable of replicating data (and very good at it)
slightly_smiling_face : Nicolas Ha
bowtie : Nicolas Ha
----
2018-10-05 23:21:33 UTC - Nicolas Ha: I don’t get those kubernetes files yet, but does it mean the zookeeper one will have to be updated too to use a Daemonset?
( <https://github.com/apache/pulsar/blob/master/deployment/kubernetes/helm/pulsar/templates/zookeeper-statefulset.yaml> )
----
2018-10-05 23:30:08 UTC - Sijie Guo: good question. yes, zookeeper has to be daemonset as well
+1 : Nicolas Ha
----