Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2019/08/30 09:11:02 UTC

Slack digest for #general - 2019-08-30

2019-08-29 09:11:30 UTC - geal: IIRC there’s some kind of copyright assignment with the foundation, right?
----
2019-08-29 09:11:33 UTC - Sijie Guo: You can submit a PIP and contribute the project back to the ASF. Upon contribution, you might need to sign an SGA (Software Grant Agreement) to transfer the repo to Apache.
----
2019-08-29 09:11:54 UTC - geal: right, I see
----
2019-08-29 09:12:50 UTC - Sijie Guo: Once it lands in the ASF, the Pulsar PMC is responsible for making sure the repo follows the principles of the ASF and that releases are done the Apache way.
----
2019-08-29 09:12:51 UTC - geal: on the feature side, what’s the priority in a client?
----
2019-08-29 09:12:56 UTC - Ali Ahmed: I don’t think the Apache Foundation can have official projects associated with alternative licenses
----
2019-08-29 09:14:00 UTC - Ali Ahmed: Most of the work right now is around the Node.js and Go clients; there hasn’t been any discussion of a Rust client
----
2019-08-29 09:17:12 UTC - Sijie Guo: As an MVP of a new language client, making sure the client has the basic pub/sub messaging features is the top priority (see the sketch below):

- producer: batched, non-batched, compressed, uncompressed
- consumer: support all subscription types, ack individually, ack cumulatively
- partitioned and non-partitioned topics
- reader features
- common logic: retry
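
A minimal sketch of that surface, using the Python client for illustration (service URL and topic are placeholders):
```
import pulsar

client = pulsar.Client('pulsar://localhost:6650')  # placeholder service URL

# produce: batched/non-batched and compressed/uncompressed sends
producer = client.create_producer(
    'persistent://public/default/demo',
    batching_enabled=True,
    compression_type=pulsar.CompressionType.LZ4,
)
producer.send(b'hello')

# consume: subscription types plus individual and cumulative acks
consumer = client.subscribe(
    'persistent://public/default/demo',
    subscription_name='demo-sub',
    consumer_type=pulsar.ConsumerType.Failover,
)
msg = consumer.receive()
consumer.acknowledge(msg)               # ack individually
# consumer.acknowledge_cumulative(msg)  # or ack cumulatively

client.close()
```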
----
2019-08-29 09:18:45 UTC - geal: right, that’s about what I had in mind
----
2019-08-29 09:18:52 UTC - geal: (it’s very limited right now)
----
2019-08-29 09:19:37 UTC - geal: there are 4 contributors for the project, so a relicensing would be doable
----
2019-08-29 09:19:54 UTC - Sijie Guo: :+1:
----
2019-08-29 09:19:58 UTC - geal: or just, starting from a specific commit, dropping MIT
----
2019-08-29 09:21:26 UTC - Sijie Guo: @geal I think the license details can be sorted out when you contribute a repo to the ASF.
----
2019-08-29 09:22:18 UTC - Sijie Guo: if you contribute a repo as an ASF incubator project, the Incubator PMC will mentor you through it; if you contribute the repo as a subproject of Pulsar, the Pulsar PMC will help you do so.
----
2019-08-29 09:24:50 UTC - geal: I see
----
2019-08-29 13:13:02 UTC - bsideup: Is there anyone I can talk to about this?
<https://github.com/apache/pulsar/blob/83365ebd536af3b3575026a06d0b83cd506898d3/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L2729>
Trying to understand why `+1` here
+1 : Kirill Merkushev
----
2019-08-29 13:36:29 UTC - Ivan Kelly: @bsideup @Matteo Merli would have to confirm, as it came with the initial commit, but it seems that isValidPosition returns true if a position could be valid at some point, not whether it currently exists
----
2019-08-29 13:36:39 UTC - Ivan Kelly: you seeing a bug with it?
----
2019-08-29 13:38:57 UTC - bsideup: I do :slightly_smiling_face:

Context:
<https://github.com/bsideup/liiklus/pull/176/files#diff-ce123c80dd3e12bcfbf6b16c9dba6dfdR61>

We sometimes observe errors when a seek is performed. Apparently that’s due to that “+1” (which does not seem to be meant for regular seeks)
----
2019-08-29 13:39:53 UTC - bsideup: Scenario:

Every newly created consumer seeks to the latest known offset (stored *outside* of Pulsar, hence manual seek)
----
2019-08-29 13:40:22 UTC - Ivan Kelly: Reader or Consumer?
----
2019-08-29 13:44:35 UTC - bsideup: Consumer
----
2019-08-29 13:45:27 UTC - bsideup: `Failover` sub. type
----
2019-08-29 13:46:58 UTC - bsideup: Full code, if it helps:
<https://github.com/bsideup/liiklus/blob/49de3a30bbe5de4db7451bce898d953aa6ed93d7/plugins/pulsar-records-storage/src/main/java/com/github/bsideup/liiklus/pulsar/PulsarRecordsStorage.java#L163>
----
2019-08-29 13:47:12 UTC - Ivan Kelly: hmm, could be a bug then. Have you tried removing +1 and running the pulsar tests?
----
2019-08-29 13:48:33 UTC - bsideup: I haven’t (don’t have the environment set up) :slightly_smiling_face: Would you recommend doing so?
----
2019-08-29 13:49:26 UTC - Ivan Kelly: I would expect whatever breaks would be the reason it is like that :slightly_smiling_face:
----
2019-08-29 13:54:11 UTC - Quentin ADAM: ping me if you need help on it
----
2019-08-29 13:59:08 UTC - Matteo Merli: The reason for the +1 is that the read position always points to the “next message to be read”. When the cursor is at the end of the topic, it is positioned on a non-existing entry (and typically the cursor gets notified when the next entry is persisted)
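
A toy sketch of that rule (not the actual broker code; the real conditions are quoted in Rowanto’s walkthrough further down):
```
# On the latest ledger, a position one past the last entry is still "valid",
# because the read position points at the *next* entry to be read.
def is_valid_read_position(last_entry_id, target_entry_id):
    return target_entry_id <= last_entry_id + 1
```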
----
2019-08-29 14:00:31 UTC - Matteo Merli: Need to get at least one coffee before looking at the other code ;)
----
2019-08-29 14:02:12 UTC - bsideup: but the consumer never knows whether it is at the end of the topic or not, and the exception is a bit too defensive here, it seems
heavy_plus_sign : Kirill Merkushev
----
2019-08-29 14:02:30 UTC - Kim Christian Gaarder: Using the java client, I get an error after using seek(timestamp). If I attempt to seek again, the consumer used to seek is closed. Is this a known problem?
----
2019-08-29 14:03:31 UTC - bsideup: Are there any examples of externally managed offsets maybe?
----
2019-08-29 14:38:45 UTC - Kim Christian Gaarder: Bug reported here: <https://github.com/apache/pulsar/issues/5073>
----
2019-08-29 14:52:04 UTC - bsideup: Another question: why does Pulsar disallow seeking to an offset pointing to some old ledger?
<https://github.com/apache/pulsar/blob/83365ebd536af3b3575026a06d0b83cd506898d3/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L2735-L2736>
----
2019-08-29 14:53:44 UTC - bsideup: Use case: rewind processing to the beginning or just to some old offset
----
2019-08-29 15:02:11 UTC - bsideup: And one more: according to the user report ( <https://github.com/bsideup/liiklus/pull/176#discussion_r319117674> ), manual seek of one partitioned consumer seems to affect other consumers.

Have you observed something like that before? Or is it “by design” (although, would be a very unfortunate design for us :slightly_smiling_face:)
----
2019-08-29 16:36:57 UTC - Ryan Samo: Is there a way for a Pulsar producer to change the schema type? I know the first message produced sets the initial schema but how can they update the schema?
----
2019-08-29 16:39:48 UTC - Addison Higham: so two things we just noticed with the golang client:
- the consumer stats aren't exposed on the consumer interface; they are plumbed through in the C++ API, so it seems like it should be fairly straightforward
- the redelivery count isn't exposed in Go OR in libpulsar; that seems like a bit bigger of a change: we need to plumb the value from the `CommandMessage` struct down into where the `Message` gets created, to match the design of the Java API
----
2019-08-29 16:57:12 UTC - Matteo Merli: I see, so you’re using the subscription but always resetting it to a particular message id. When that message id is at the end of the topic, you’re getting an exception.

Is that the right description?
----
2019-08-29 16:58:49 UTC - Matteo Merli: > Are there any examples of externally managed offsets maybe?

The typical scenario would be to use a Reader instead of a Consumer, so that the consumption position is always determined by the app.

(That doesn’t mean that seek() shouldn’t work correctly as well…)
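
A minimal sketch of that pattern with the Python client (service URL, topic, and start position are placeholders; a MessageId restored from an external store would work the same way):
```
import pulsar

client = pulsar.Client('pulsar://localhost:6650')  # placeholder service URL

# The app picks the start position explicitly, e.g. MessageId.earliest,
# MessageId.latest, or a MessageId deserialized from an external store.
reader = client.create_reader(
    'persistent://public/default/demo',
    pulsar.MessageId.earliest,
)
while True:
    msg = reader.read_next()  # blocks; read_next(timeout_millis=...) to poll
    print(msg.message_id(), msg.data())
```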
----
2019-08-29 17:07:17 UTC - Poule: is there a Python function example somewhere that uses PickleSerde?
----
2019-08-29 17:09:37 UTC - Matteo Merli: @bsideup the check there is for a ledger that is missing.

eg: `ledgers = [ 1, 3, 9, 10 ]`

`seek( ledgerId: 5, entryId: 2 )`  -> fail
`seek( ledgerId: 3, entryId: 2 )`  -> success
+1 : bsideup
----
2019-08-29 17:10:36 UTC - Poule: or better, an example of a Python Function that outputs to an Avro topic
----
2019-08-29 17:11:09 UTC - Poule: do I need to use Pickle to output to an Avro topic?
----
2019-08-29 17:11:27 UTC - Matteo Merli: That shouldn’t be the case. I don’t understand the claim in that report that the “consumer does not reconnect”.

With failover subscription type, whenever a consumer gets disconnected, another available consumer will take over
----
2019-08-29 17:12:37 UTC - Matteo Merli: Regarding the disconnect: the current behavior is indeed to disconnect all consumers, perform the seek, and let everyone reconnect.

The disconnect is ultimately unnecessary and was already removed in master.
+1 : Kirill Merkushev, bsideup
----
2019-08-29 17:13:33 UTC - Matteo Merli: At this point, the Python functions are not yet integrated with the schema, so you would have to deal with the serialization directly
----
2019-08-29 17:14:34 UTC - Poule: how can I do this?
----
2019-08-29 17:19:36 UTC - Matteo Merli: uhm.. sorry my answer above was not precise.

In Python function, the serialization is indeed defined by the Serde that is configured.

What it does not (yet) do is to use that as the “schema” for the topic (to be validated and enforced by brokers).

You can still define an Avro Serde and specify it when you create the function (`--output-serde-classname MyAvroSerde`)
----
2019-08-29 17:20:07 UTC - Matteo Merli: however that won’t automatically set the topic schema to Avro with a specific schema
----
2019-08-29 17:20:35 UTC - Retardust: @Anonymitaet cool, any idea when? Tag me here when it happens, ok? We will probably test that in the near future for our legacy system replication needs. It’s hard to use the Java (or any other language) Pulsar clients there :(
----
2019-08-29 17:21:18 UTC - Sijie Guo: I think what you are looking for is close to the issue described here: <https://github.com/apache/pulsar/issues/4806>
----
2019-08-29 17:21:39 UTC - Sijie Guo: See the discussion in <https://github.com/apache/pulsar/pull/5056#issuecomment-525519313>
----
2019-08-29 17:22:24 UTC - Ryan Samo: Thanks @Sijie Guo I’ll check it out
----
2019-08-29 18:29:49 UTC - Poule: not sure what to put in `serialize()` and `deserialize()` though
Do I need to go find a python lib that translates avro-to-python_dict and python_dict-to-avro?
----
2019-08-29 18:30:09 UTC - Poule: is that the idea?
----
2019-08-29 18:30:59 UTC - Poule: because there is no `AvroSerDe` in <https://github.com/apache/pulsar/blob/master/pulsar-client-cpp/python/pulsar/functions/serde.py>
----
2019-08-29 18:31:28 UTC - Matteo Merli: yes, you would have to define the Avro serde
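
For example, a minimal sketch of such a serde (assuming the `fastavro` library and a hypothetical record schema):
```
import io

from fastavro import schemaless_reader, schemaless_writer
from pulsar.functions.serde import SerDe

# Hypothetical schema; replace with your own record definition.
SCHEMA = {
    "type": "record",
    "name": "Example",
    "fields": [{"name": "value", "type": "string"}],
}

class MyAvroSerDe(SerDe):
    def serialize(self, input):
        buf = io.BytesIO()
        schemaless_writer(buf, SCHEMA, input)  # python dict -> Avro bytes
        return buf.getvalue()

    def deserialize(self, input_bytes):
        # Avro bytes -> python dict
        return schemaless_reader(io.BytesIO(input_bytes), SCHEMA)
```
You would then pass it at function-creation time with `--output-serde-classname` as mentioned above.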
----
2019-08-29 18:32:01 UTC - Poule: ok I begin to understand
----
2019-08-29 18:35:03 UTC - Retardust: are there any built-in Pulsar functions? I have a very simple case: two topics, "buffer" and "target". "buffer" fills with changes and should be connected to "target" only after some external event. So I want to start a function that sinks "buffer" into "target" when the event occurs. Is there a simple bridge function in Pulsar, or do I need to write my own and deploy it?)
----
2019-08-29 18:37:37 UTC - Poule: @Retardust I think you need to write it.
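
A minimal pass-through sketch of such a bridge function (the class name is hypothetical; topics are wired up at deploy time, e.g. with `--inputs buffer --output target`):
```
from pulsar import Function

class BridgeFunction(Function):
    # Whatever process() returns is published to the configured output
    # topic, so returning the input unchanged bridges buffer -> target.
    def process(self, input, context):
        return input
```
Starting and stopping the function on your external event would still be up to you (e.g. create/delete it via pulsar-admin).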
----
2019-08-29 19:25:30 UTC - Ming Fang: I’m looking to use NiFi with Pulsar.  NiFi processors seem to fulfill the same role as Pulsar Functions/Source/Sink.  Are there any advantages to using Pulsar Functions over NiFi processors?
----
2019-08-29 19:26:18 UTC - Ali Ahmed: @Ming Fang You have to run NiFi instances yourself; the Pulsar function worker can run the instances for you and scale out as needed
----
2019-08-29 19:28:18 UTC - Ming Fang: Yes that’s true. Although I actually like the idea of not running processing on the broker nodes to avoid resource contention.  If I use Pulsar Functions I would run them as separate Kubernetes pods.
----
2019-08-29 19:29:31 UTC - Ming Fang: The nice thing about NiFi is that it has hundreds of connectors, has NiFi Registry + git integration for deployment, and has a visual tool for development and debugging
----
2019-08-29 19:31:14 UTC - Ali Ahmed: sure, if your compute is stateless that could work; with functions you have a replicated state store built in. It all depends on your specific needs.
----
2019-08-29 19:34:45 UTC - Ming Fang: You’re absolutely right! State is the key advantage. I think I’m going to build a new NiFi processor called PulsarFunction to complement the existing Consumer/Publisher processors
----
2019-08-29 19:35:48 UTC - David Kjerrumgaard: @Ming Fang In addition to the lack of state, NiFi only provides you with a limited number of processors with which to work. So if you are able to perform the necessary processing using one or more of those processors, then you are fine. With Pulsar functions you can implement any Java, Python, or Go code that you like (including using third-party libraries).
----
2019-08-29 19:38:16 UTC - Ming Fang: That’s an excellent point. It would be perfect if NiFi ran its processors inside a Docker container.  That way I could package anything I want and avoid this limitation
----
2019-08-29 19:38:18 UTC - David Kjerrumgaard: @Ming Fang When I wrote the NiFi processors, I envisioned a use case where you can leverage the existing suite of NiFi connectors to feed data into a Pulsar topic, have some Pulsar functions perform the processing you require, and publish to a "results" topic. Then, if you like, you can consume from the results topic in another NiFi flow.  Sort of a hybrid and complementary approach.
----
2019-08-29 19:40:40 UTC - David Kjerrumgaard: This is a common pattern with NiFi, to use it to push data into a messaging system such as Pulsar or Kafka for downstream processing. Again, one of the biggest limitations of NiFi is the lack of disk space for storing data. It wasn't designed to hold incoming data for days, it was designed to move data from system A to system B quickly.
----
2019-08-29 19:43:15 UTC - Ming Fang: @David Kjerrumgaard Thanks for your insight. It was very helpful.  I’m going to play around to see if it’s feasible to run Pulsar Functions inside NiFi.
----
2019-08-29 19:45:18 UTC - David Kjerrumgaard: It might be possible, but bear in mind that if you use a processor that has high latency (such as a pulsar function) it will introduce backpressure to the entire flow, which might result in data loss from the source systems.
----
2019-08-29 19:47:24 UTC - David Kjerrumgaard: I would recommend looking at the ExecuteGroovyScript processor in NiFi as an example of a "wrapper" processor that wraps arbitrary code.
+1 : Ming Fang
----
2019-08-29 20:25:24 UTC - Kirill Merkushev: > Is that the right description?
Yes, that’s what happens (I’m aware of the case described)
----
2019-08-29 21:49:30 UTC - bsideup: > The typical scenario would be by using a Reader

but it doesn’t support a per-partition reader, does it?
----
2019-08-29 23:34:25 UTC - jialin liu: Hi, how can I change the max message size?
----
2019-08-29 23:35:00 UTC - jialin liu: I tested with pulsar-perf with 1MB msg size:  ./pulsar-perf produce -r 100 -s 1024000 -n 1 -t 1 -time 120 persistent://public/functions/test1_topic1
----
2019-08-29 23:35:10 UTC - jialin liu: ran into an error like: 23:27:07.713 [pulsar-timer-5-1] WARN  org.apache.pulsar.client.impl.ProducerImpl - [persistent://public/functions/test1_topic1_msg1m] [local-3-0] error while create opSendMsg by batch message container -- io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 956301312, max: 960954368)
----
2019-08-29 23:35:17 UTC - jialin liu: can anybody advise?
----
2019-08-29 23:42:50 UTC - David Kjerrumgaard: @jialin liu Use the `maxMessageSize` property in the `broker.conf` file
----
2019-08-29 23:47:16 UTC - David Kjerrumgaard: @jialin liu The error above seems to indicate that you are running out of direct memory, which Netty uses to store incoming messages. So you might want to increase the amount of direct memory (if possible) using the `-XX:MaxDirectMemorySize` JVM switch
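
For reference, a sketch of both knobs (values are examples only; 5242880 is the documented default for `maxMessageSize`):
```
# broker.conf: raise the broker-side limit (default: 5242880 bytes, i.e. 5 MB)
maxMessageSize=10485760

# conf/pulsar_env.sh: give the JVM more direct memory for Netty buffers
PULSAR_MEM="-Xms2g -Xmx4g -XX:MaxDirectMemorySize=4g"
```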
----
2019-08-30 00:29:39 UTC - jialin liu: Thanks @David Kjerrumgaard
----
2019-08-30 00:40:38 UTC - jialin liu: @David Kjerrumgaard we have the max direct memory size set to 2G, but are still facing the same error
----
2019-08-30 01:35:13 UTC - Chris DiGiovanni: Running Pulsar 2.3.0 and having issues with autorecovery not being able to finish ledger replication.  On my bookies I'm seeing errors like the below:
```
2019-08-29 20:33:12.858 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L3388 E301524-E301524, Sent to [chhq-vuppulbk01.us.drwholdings.com:3181, chhq-vuppulbk04.us.drwholdings.com:3181, chhq-vuppulbk03.us.drwholdings.com:3181], Heard from [] : bitset = {}, Error = 'Too many requests to the same Bookie'. First unread entry is (-1, rc = null)
2019-08-29 20:33:12.858 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 123 operations to channel [id: 0xf6a3fc7d, L:/10.8.53.81:45124 - R:chhq-vuppulbk01.us.drwholdings.com/10.8.53.66:3181] for chhq-vuppulbk01.us.drwholdings.com:3181
2019-08-29 20:33:12.858 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie operation timeout while reading L3388 E300040 from bookie: chhq-vuppulbk01.us.drwholdings.com:3181
2019-08-29 20:33:12.858 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie operation timeout while reading L3388 E300190 from bookie: chhq-vuppulbk01.us.drwholdings.com:3181
2019-08-29 20:33:12.858 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie operation timeout while reading L3388 E300202 from bookie: chhq-vuppulbk01.us.drwholdings.com:3181
2019-08-29 20:33:12.858 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie operation timeout while reading L3388 E300118 from bookie: chhq-vuppulbk01.us.drwholdings.com:3181
2019-08-29 20:33:12.858 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie operation timeout while reading L3388 E300199 from bookie: chhq-vuppulbk01.us.drwholdings.com:3181
2019-08-29 20:33:12.858 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie operation timeout while reading L3388 E300067 from bookie: chhq-vuppulbk01.us.drwholdings.com:3181
2019-08-29 20:33:12.858 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie operation timeout while reading L3388 E302679 from bookie: chhq-vuppulbk01.us.drwholdings.com:3181
```
----
2019-08-30 01:37:43 UTC - Anonymitaet: @Retardust will document ASAP and keep you updated
----
2019-08-30 01:37:45 UTC - Chris DiGiovanni: Any ideas would be helpful.  Disabling autorecovery and re-enabling it sometimes helps, but it has been getting stuck at 1 unreplicated ledger
----
2019-08-30 01:52:16 UTC - Sijie Guo: Use the ledger metadata command in `bin/bookkeeper shell` to check the metadata for ledger `3388`?
----
2019-08-30 01:54:10 UTC - Chris DiGiovanni: What exactly am I looking for in this output?
----
2019-08-30 01:56:11 UTC - Chris DiGiovanni: Filled up my scrollback buffer in tmux :shrug:
----
2019-08-30 02:05:00 UTC - Chris DiGiovanni: Well, I'm stuck at a single ledger that is unreplicated.  I ended up running ledgermetadata on the unreplicated ledger id 9780, and below is the output:

```
ledgerID: 9780
LedgerMetadata{formatVersion=2, ensembleSize=3, writeQuorumSize=3, ackQuorumSize=2, state=CLOSED, length=2147599016, lastEntryId=27861, digestType=CRC32C, password=base64:, ensembles={0=[chhq-vuppulbk06.us.drwholdings.com:3181, chhq-vuppulbk01.us.drwholdings.com:3181, chhq-vuppulbk08.us.drwholdings.com:3181], 19516=[chhq-vuppulbk06.us.drwholdings.com:3181, chhq-vuppulbk04.us.drwholdings.com:3181, chhq-vuppulbk08.us.drwholdings.com:3181]}, customMetadata={component=base64:bWFuYWdlZC1sZWRnZXI=, pulsar/managed-ledger=base64:ZmlvL2RlZmF1bHQvcGVyc2lzdGVudC9jbWU=, application=base64:cHVsc2Fy}}
```
----
2019-08-30 02:06:40 UTC - Chris DiGiovanni: Looks like the ledger is 2Gi large...  Not a lot of data.
----
2019-08-30 02:07:04 UTC - Chris DiGiovanni: Assuming the length is in bytes...
----
2019-08-30 02:19:43 UTC - Chris DiGiovanni: `bookkeeper shell listunderreplicated -printreplicationworkerid -printmissingreplica` produces the following output.

```
9780
        Ctime : 1567094832243
        MissingReplica : chhq-vuppulbk01.us.drwholdings.com:3181
        MissingReplica : chhq-vuppulbk04.us.drwholdings.com:3181
        MissingReplica : chhq-vuppulbk08.us.drwholdings.com:3181
        MissingReplica : chhq-vuppulbk06.us.drwholdings.com:3181
```
----
2019-08-30 02:43:19 UTC - Ali Ahmed: I am thinking of adding a second message type to pulsar-client; this one would separate messages on semicolons instead of commas, so the Pulsar CLI can work with JSON-type strings.
----
2019-08-30 02:43:27 UTC - Ali Ahmed: any opinions on the matter?
----
2019-08-30 02:45:57 UTC - Igor Zubchenok: I researched the memory allocations in the Pulsar broker that cause a lot of GC, especially in my case, and here are my findings.
Image from Java Flight Recorder: 42GB of memory allocated per 5 minutes:
----
2019-08-30 02:46:09 UTC - Igor Zubchenok: 1. In the BookKeeper client there is a ConcurrentSkipListMap that allocates a lot when iterating.
----
2019-08-30 02:46:25 UTC - Igor Zubchenok: 2. PositionImpl - this class looks immutable, yet it is copied here, which generates 6GB: <https://github.com/apache/pulsar/blob/e78beaaf4815509dc906838aaf4057a6c1445d0b/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java#L1849>
----
2019-08-30 02:46:37 UTC - Igor Zubchenok: 3. Is this serialization really needed? This allocates 820MB. <https://github.com/apache/pulsar/blob/7be1ee1fdb59421ac858b38840d3baf8c9073a5c/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractBaseDispatcher.java#L77>
----
2019-08-30 02:47:02 UTC - Igor Zubchenok: 4. I have 'expose topic metrics' disabled, but there are still allocations for topic stats when collecting data for Prometheus. `NamespaceStatsAggregator.getTopicStats` is called and allocates 200MB/5min; it's not much, but why?
<https://github.com/apache/pulsar/blob/e78beaaf4815509dc906838aaf4057a6c1445d0b/pulsar-broker/src/main/java/org/apache/pulsar/broker/stats/prometheus/NamespaceStatsAggregator.java#L64>
----
2019-08-30 02:51:22 UTC - Chris DiGiovanni: Well I'm able to read the ledger just fine using bookkeeper shell readledger.
----
2019-08-30 02:52:28 UTC - Chris DiGiovanni: If I run bookkeeper shell recover on any of those bookies, I get those bookie operation timeouts.
----
2019-08-30 04:08:38 UTC - tuteng: @Retardust  You can refer to this document first, and we will gradually improve it later. <https://github.com/apache/pulsar/blob/4ddb51ff8c1b200f329cc70ca24fb8e02c0abbc4/site2/docs/io-netty.md>
----
2019-08-30 06:20:24 UTC - Retardust: I see, thanks! Are there only at-most-once guarantees for TCP? Or could I receive acks somehow?
----
2019-08-30 06:33:12 UTC - tuteng: You can set the semantics through the --processing-guarantees option. Possible values: [ATLEAST_ONCE, ATMOST_ONCE, EFFECTIVELY_ONCE]
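
For example (a sketch only; check `pulsar-admin sources create --help` for the exact flags, and the names here are placeholders):
```
bin/pulsar-admin sources create \
  --name netty-source \
  --source-type netty \
  --destination-topic-name my-topic \
  --processing-guarantees ATLEAST_ONCE
```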
----
2019-08-30 06:36:04 UTC - bsideup: right… I missed the null check, sorry :+1:
----
2019-08-30 06:37:00 UTC - bsideup: Great, thank you! Any ETA of this change? I assume this is a broker-side change, not client-side?
----
2019-08-30 06:38:18 UTC - Retardust: cool, and how will it work with TCP? I suppose I need to receive an ack on the client side for at_least_once? I will experiment later)
----
2019-08-30 06:41:40 UTC - bsideup: How does rebalancing work with partitioned topics?
Consider we have 2 instances (services, on different machines) of our consumer (failover sub. type). The topic has 32 partitions.
The consumers subscribe to each sub-topic, so that every instance has 32 consumers (1 per partition, for per-partition processing).

How do we ensure that the load is distributed equally between the *instances*?
----
2019-08-30 06:58:01 UTC - tuteng: This semantic is applied to the source, here the Netty source, so it works here. The client receives and then acks according to the normal logic
----
2019-08-30 07:01:15 UTC - Retardust: I mean on the producer side, sorry. "our legacy" -> "netty source" -> "pulsar topic". How do I ensure that the Netty source sends the message to Pulsar, and that I receive an ack on the "our legacy" side?)
----
2019-08-30 07:02:11 UTC - Retardust: I mean "our legacy" is producer:)
----
2019-08-30 07:33:38 UTC - tuteng: I see, I think you can add an ack here <https://github.com/apache/pulsar/blob/master/pulsar-io/netty/src/main/java/org/apache/pulsar/io/netty/server/NettyServerHandler.java#L60> after successfully sending the data to the topic. Similar to this function:
```
@Override
public void ack() {
    log.info("netty record ack id is {}", this.id);
    connector.ack(this.id);
}
```
----
2019-08-30 08:24:30 UTC - heathkang: @heathkang has joined the channel
----
2019-08-30 08:35:33 UTC - heathkang: Hi, I run Pulsar in Docker and tried to write to a topic with the Go client, but got a `BrokerPersistenceError`:
```
2019/08/30 16:22:13.601 c_client.go:68: [info] WARN  | ClientConnection:902 | [[::1]:52026 -> [::1]:6650] Received error response from server: 11 -- req_id: 0
2019/08/30 16:22:13.601 c_client.go:68: [info] ERROR | ProducerImpl:216 | [persistent://public/default/my-topic, ] Failed to create producer: BrokerPersistenceError
```
I am wondering what `BrokerPersistenceError` is? Thanks for your help.
----
2019-08-30 08:36:47 UTC - Rowanto: @Rowanto has joined the channel
wave : bsideup
----
2019-08-30 08:49:20 UTC - Rowanto: Actually the issue is more at lines 2729 and 2744.

I checked a ledger in Zookeeper:
`get /ledgers/00/0000/L2242`
which resulted in

```
...
ensembleSize: 2
length: 12066
lastEntryId: 20
state: CLOSED
...
```

doing a `seek( ledgerId: 2242, entryId: 21 )` will only succeed if it's the latest ledger, because in line 2729 the condition is:
`return position.getEntryId() <= (last.getEntryId() + 1);`

and in line 2744 (the case when it's an old ledger), the condition is:
`return position.getEntryId() < ls.getEntries()`
+1 : bsideup, Kirill Merkushev
100 : bsideup
----
2019-08-30 08:52:04 UTC - Rowanto: hence, our manual seek for older consumers is triggering a lot of InvalidCursorPositionExceptions (in an endless loop)
----