Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2018/08/31 09:11:02 UTC

Slack digest for #general - 2018-08-31

2018-08-30 09:11:14 UTC - Poule: do you know when State will be supported in python functions?
snake : Poule
----
2018-08-30 09:26:01 UTC - Sijie Guo: @Poule it will be planned for 2.3 release.
hugging_face : Poule
----
2018-08-30 10:20:20 UTC - Chris Miller: OK thanks @Matteo Merli, that's really helpful. I wasn't aware of the part about permits (is that covered in the docs?). I'd mistakenly assumed "round robin" with 3 consumers meant strict delivery to Consumer #1, then #2, #3, back to #1, #2, #3, etc., which is obviously not ideal.

ReceiverQueueSize=0 not being able to prefetch is exactly what I want for my use case (messages that take a consumer anywhere from a second to several minutes to process), since it prevents any one consumer from prefetching multiple messages that might take a long time, potentially leaving other consumers idle when the rest of the queue empties.

Do you have any comment on the "cannot be used with partitioned topics" aspect? It's not clear to me why that is the case.
----
2018-08-30 11:30:21 UTC - Kesav Kolla: Is there any way to enable simple username/password security for a given topic?
----
2018-08-30 14:50:03 UTC - Matteo Merli: @Chris Miller The flow control mechanism is described in the wire protocol docs: <http://pulsar.apache.org/docs/en/develop-binary-protocol/#consumer>

> Do you have any comment on the “cannot be used with partitioned topics” aspect? It’s not clear to me why that is the case.

Oh, I forgot to reply on this question yesterday. The reason is that when you have a “partitioned-topic” you still have 1 single “Consumer” instance.

With `receiverQueueSize=0`, whenever you call `receive()` on the partitioned topic, the client library should send 1 single permit to the broker, but it doesn’t know to which partition. One option would be to use `receiverQueueSize=1`, which is very close, though it does minimal prefetching of 1 message per partition.

Regarding the need to use partitions, keep in mind that a single-partition topic can sustain up to 1M msg/s (with batching enabled) and ~100K msg/s (without batching). Since you can scale the consumers on a single partition, you would only need to use multiple partitions if your expected throughput gets close to these values.
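
To illustrate why prefetching matters for slow messages, here is a toy Python model. It is not the Pulsar client and every name in it is made up for the illustration. It assumes a consumer may hold `receiver_queue_size` prefetched messages plus the one it is processing, and that the dispatcher deals messages round-robin to whichever consumers still have room. Real dispatch and permit refill are more involved.

```python
import heapq
from collections import deque


def makespan(durations, consumers, receiver_queue_size):
    """Time at which the last message finishes in this toy model."""
    backlog = deque(durations)            # messages still held by the "broker"
    limit = receiver_queue_size + 1       # assumed: prefetched messages + the one in progress
    queues = [deque() for _ in range(consumers)]  # prefetched, not yet started
    busy = [False] * consumers
    events = []                           # heap of (finish_time, consumer)
    finished_at = 0

    def dispatch():
        # Deal backlog messages round-robin to consumers that still have room.
        progress = True
        while backlog and progress:
            progress = False
            for c in range(consumers):
                held = len(queues[c]) + (1 if busy[c] else 0)
                if backlog and held < limit:
                    queues[c].append(backlog.popleft())
                    progress = True

    def start_idle(now):
        # Idle consumers start on the next message in their local queue.
        for c in range(consumers):
            if not busy[c] and queues[c]:
                busy[c] = True
                heapq.heappush(events, (now + queues[c].popleft(), c))

    dispatch()
    start_idle(0)
    while events:
        now, c = heapq.heappop(events)
        busy[c] = False
        finished_at = now
        dispatch()
        start_idle(now)
    return finished_at


durations = [60, 1, 60, 1, 1, 1]   # seconds; two slow messages among fast ones
print(makespan(durations, consumers=2, receiver_queue_size=0))  # 62
print(makespan(durations, consumers=2, receiver_queue_size=3))  # 121
```

In this model, without prefetching the free consumer always takes the next message, so the two slow messages end up on different consumers. With a queue of 3, both slow messages are prefetched by the same consumer while the other goes idle after 3 seconds.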
----
2018-08-30 14:52:26 UTC - Sivabalan Nagarajan: @Sivabalan Nagarajan has joined the channel
----
2018-08-30 14:54:00 UTC - Chris Miller: I see, thanks @Matteo Merli. 1M msg/s is plenty enough throughput for my case. I was thinking more about 10s of thousands of consumers all talking to the same broker and running into potential issues with the # of concurrent connections etc. But if receiverQueueSize=1 allows partitions to be used, that should be fine
----
2018-08-30 15:41:07 UTC - Poule: say I have a stream of "order" events that I want to enrich with a User Profile.
first stream "user_profile_created" that I can read with a function and then PutState("<user_id>", "{name: john}") to create a K/V store with all user profiles.
second stream "orders" that I read with a function... then how can I getState(order.user_id....) from the previous K/V store to enrich the orders with user profiles?
----
2018-08-30 16:04:33 UTC - Matteo Merli: @Kesav Kolla There is indeed. Although it was not added to the documentation :slightly_smiling_face:

There is support for BasicAuthentication type in which the broker has a file with principals and password. It should be used with TLS encryption enabled.

To get started, in the client you need to specify `org.apache.pulsar.client.impl.auth.AuthenticationBasic` as the auth provider. In the config map, pass the `userId` and `password` arguments. Ref: <https://github.com/apache/incubator-pulsar/blob/8b2929b3e5403fb44653976dc74be287666f7b96/pulsar-client/src/main/java/org/apache/pulsar/client/impl/auth/AuthenticationBasic.java>

On the broker side, configure the provider as `org.apache.pulsar.broker.authentication.AuthenticationProviderBasic`. Also set the system property `-Dpulsar.auth.basic.conf=/path/to/htpasswd/file`. Ref: <https://github.com/apache/incubator-pulsar/blob/8b2929b3e5403fb44653976dc74be287666f7b96/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/authentication/AuthenticationProviderBasic.java>
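
As a rough sketch of what those two arguments amount to, the Python below builds a parameter string and a credential token using only the standard library. Two things here are assumptions, not confirmed in this thread: that the provider accepts its arguments as a JSON object with the keys `userId` and `password`, and that the credentials travel as a standard HTTP Basic token. The helper names are made up for the illustration.

```python
import base64
import json


def basic_auth_params(user_id, password):
    """Serialize the two arguments named above as a JSON object (assumed format)."""
    return json.dumps({"userId": user_id, "password": password})


def http_basic_token(user_id, password):
    """Standard HTTP Basic credential: base64 of 'user:password'."""
    raw = f"{user_id}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


token = http_basic_token("alice", "s3cret")
# base64 is an encoding, not encryption: anyone who sees the token can reverse it.
print(base64.b64decode(token.split(" ", 1)[1]))  # b'alice:s3cret'
```

The last line shows why this should only be used with TLS enabled: the password is trivially recoverable from the token.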
----
2018-08-30 17:50:51 UTC - Sanjeev Kulkarni: Currently the context api supports getting state from the same function. It doesn’t support getting state from a different function. We could probably add this in 2.3
----
2018-08-30 17:56:50 UTC - Poule: would be great if k/v would not be tied to a function, and would just be usable by referencing the k/v store name
----
2018-08-30 17:57:02 UTC - Poule: usable anywhere
----
2018-08-30 18:00:31 UTC - Poule: getstate('mystore1', 'user1') callable from any function... putstate('mystore1', 'user1', '{firstname:billy}') etc....
----
2018-08-30 18:04:46 UTC - Sijie Guo: @Poule: a better way may be to keep the current state interface under context as it is now: that API accesses/mutates the function’s own state. Then add another interface in the context, e.g. externalFunctionState(“<tenant>/<ns>/<function>”), to retrieve a state accessor so you can access the state produced by external functions. This would probably make things a bit clearer, because ideally you don’t want multiple different functions mutating the same state.
----
2018-08-30 18:15:20 UTC - Poule: what happens to the state if I delete the function?
----
2018-08-30 18:23:52 UTC - Sijie Guo: @Poule good question! currently the state will still be kept if a function is deleted. there is some tooling and cleanup required for that. ideally when a function is removed, the state might need to be removed.

we can probably do better by allowing users to specify the KV table name on submission, so that when the function is deleted, the table will still be there. Then the function context can provide openStateTable(“table_name”) to access the state table by table name. In this way, the application manages the state tables outside of the lifecycles of functions, and we will provide admin tools to create/delete those state tables.
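
A minimal sketch of how the enrichment use case could look if tables were opened by name. This is a pure-Python, in-memory stand-in: `open_state_table` is a hypothetical helper named after the `openStateTable` idea above, and nothing here is an existing Pulsar Functions API.

```python
import json

# In-memory stand-in for named state tables that outlive any single function.
_TABLES = {}


def open_state_table(name):
    """Return the table with this name, creating it on first use (hypothetical API)."""
    return _TABLES.setdefault(name, {})


def on_user_profile_created(event):
    """First function: store each profile under its user id."""
    profiles = open_state_table("user_profile")
    profiles[event["user_id"]] = json.dumps(event["profile"])


def on_order(order):
    """Second function: enrich an order from the table written by the first."""
    profiles = open_state_table("user_profile")
    raw = profiles.get(order["user_id"])
    enriched = dict(order)
    enriched["user"] = json.loads(raw) if raw is not None else None
    return enriched


on_user_profile_created({"user_id": "u1", "profile": {"name": "john"}})
print(on_order({"order_id": 7, "user_id": "u1"}))
```

Only one function writes to the table here and the other only reads, which matches the earlier point about not having multiple functions mutate the same state.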
----
2018-08-30 18:33:56 UTC - Poule: would it be possible to make a second type of state table that auto-syncs with a compacted topic? So I'd just publish to topic `user_profile` and it keeps the state table in sync. It would remove the need to manually putState all my profiles.
----
2018-08-30 18:35:05 UTC - Poule: then my state table `user_profile` would be my source of truth for profiles
+1 : Matteo Merli
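
The auto-sync idea boils down to folding a keyed stream into a table where the latest value per key wins. A minimal Python sketch, with a plain list standing in for the topic; treating `None` as a delete marker is an assumption of this sketch, and `apply_updates` is a made-up name.

```python
def apply_updates(table, updates):
    """Fold (key, value) updates into table: last write wins, None deletes."""
    for key, value in updates:
        if value is None:
            table.pop(key, None)   # assumed delete marker
        else:
            table[key] = value
    return table


stream = [
    ("user1", '{"firstname": "billy"}'),
    ("user2", '{"firstname": "john"}'),
    ("user1", '{"firstname": "bill"}'),   # newer value replaces the older one
    ("user2", None),                      # user2 is removed
]
print(apply_updates({}, stream))
```

Replaying the same stream from the start always rebuilds the same table, which is what would let the topic serve as the source of truth.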
----
2018-08-30 18:36:47 UTC - Sijie Guo: that’s good feedback
----
2018-08-30 19:25:42 UTC - Poule: would it be possible to access a state table from outside functions, for example from any python script that can connect to Pulsar?
----
2018-08-30 19:39:27 UTC - Sijie Guo: @Poule so, access to state can be done through bin/pulsar-admin. but ideally yes, clients can access state outside of functions from any language client. users can use the libraries that functions are using to talk to the bk table store directly.
----
2018-08-30 19:48:46 UTC - Sijie Guo: @Poule I summarized the discussion here in a master GitHub issue: <https://github.com/apache/incubator-pulsar/issues/2489> for tracking the tasks to address your feedback
ok_hand : Poule
----
2018-08-30 19:49:04 UTC - Sijie Guo: if I missed anything feel free to comment in the issue as well
----
2018-08-30 23:21:41 UTC - Grant Wu: Anyone know how to debug `Error Checking/Getting Partition Metadata while creating producer on <persistent://public/default/job-status-updated>`
----
2018-08-30 23:21:54 UTC - Grant Wu: Appears to be coming from <https://github.com/apache/incubator-pulsar/blob/b6feac8180c978e3c59dfbcd4ef44494dd73b385/pulsar-client-cpp/lib/ClientImpl.cc#L159>
----
2018-08-31 00:08:54 UTC - Matteo Merli: @Grant Wu What is the reported error code?
----
2018-08-31 00:11:14 UTC - Grant Wu: Sorry where can I find that?
----
2018-08-31 00:11:47 UTC - Matteo Merli: After the error message it should return the `Result` code when creating the producer
----
2018-08-31 00:11:48 UTC - Grant Wu: 2018-08-30 23:19:11.342 ERROR ClientImpl:158 | Error Checking/Getting Partition Metadata while creating producer on <persistent://public/default/job-status-updated> -- 5
----
2018-08-31 00:11:55 UTC - Grant Wu: does that help?
----
2018-08-31 00:11:57 UTC - Matteo Merli: 5 ConnectError
----
2018-08-31 00:12:54 UTC - Grant Wu: Hrm
----
2018-08-31 00:13:02 UTC - Grant Wu: Yeah I think our docker-compose is completely broken
----
2018-08-31 00:13:04 UTC - Matteo Merli: can you share the rest of the log ? (or: can you check if the broker logs have something? )
----
2018-08-31 00:15:03 UTC - Grant Wu: hrm, client logs, right?
----
2018-08-31 00:15:25 UTC - Matteo Merli: yes, that first
----
2018-08-31 00:16:34 UTC - Grant Wu: It might just be a configuration issue on our end
----
2018-08-31 00:16:45 UTC - Grant Wu: just a sec
----
2018-08-31 00:18:13 UTC - Grant Wu: ```
2018-08-31 00:16:48.960 INFO  ConnectionPool:63 | Created connection for <pulsar://pulsar-broker:6650>
2018-08-31 00:16:48.965 ERROR ClientConnection:392 | [<none> -> <pulsar://pulsar-broker:6650>] Resolve error: asio.netdb:1 : Host not found (authoritative)
2018-08-31 00:16:48.965 INFO  ClientConnection:1182 | [<none> -> <pulsar://pulsar-broker:6650>] Connection closed
2018-08-31 00:16:48.965 ERROR ClientImpl:158 | Error Checking/Getting Partition Metadata while creating producer on <persistent://public/default/job-status-updated> -- 5
2018-08-31 00:16:48.965 INFO  ClientConnection:195 | [<none> -> <pulsar://pulsar-broker:6650>] Destroyed connection
graphql.error.located_error.GraphQLLocatedError: Pulsar error: ConnectError
```
----
2018-08-31 00:19:00 UTC - Matteo Merli: yes, `pulsar-broker` is not resolved by DNS
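
One quick way to confirm a resolution problem like this from inside the client's container is to run the same lookup with the Python standard library. This is a generic sketch, not part of the Pulsar client, and the helper names are made up.

```python
import socket
from urllib.parse import urlparse


def split_service_url(service_url):
    """Extract (host, port) from a URL such as pulsar://pulsar-broker:6650."""
    parsed = urlparse(service_url)
    return parsed.hostname, parsed.port


def can_resolve(host, port):
    """True if the name resolves to at least one address from this machine."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Swap in the service URL your client is actually configured with.
    host, port = split_service_url("pulsar://localhost:6650")
    print(host, port, can_resolve(host, port))
```

If this returns False for the broker's hostname, the client cannot reach the broker regardless of Pulsar configuration, so the fix belongs in the container networking or DNS setup.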
----
2018-08-31 00:19:05 UTC - Grant Wu: Our networking might just be completely misconfigured
----
2018-08-31 01:54:24 UTC - Kesav Kolla: @Matteo Merli Thanks for the suggestion. Unfortunately my client is Python. Is there any way to pass that auth config from the Python client?
----
2018-08-31 01:55:58 UTC - Matteo Merli: I am afraid not yet :confused:
----
2018-08-31 01:56:26 UTC - Matteo Merli: It would be easy to add, but it’s not in there yet
----
2018-08-31 02:03:14 UTC - Kesav Kolla: I'm running the `apachepulsar/pulsar` docker image, which is basically a standalone all-in-one Pulsar server. My container keeps getting shut down after a certain period of time, and I don't know the underlying reason. In the logs I see that a shutdown hook was received. Is there an inactivity timeout or similar that automatically shuts the broker down?
----
2018-08-31 02:03:37 UTC - Kesav Kolla: Shall I create an issue?
----
2018-08-31 02:09:26 UTC - Kesav Kolla: This is the last log
----