You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2020/04/22 09:11:04 UTC
Slack digest for #general - 2020-04-22
2020-04-21 09:28:22 UTC - Sijie Guo: When you create pulsar function, you need to use multipart to upload the function jar file. The last comment in that issue gives an example of how it was done in pulsarctl.
----
2020-04-21 09:47:33 UTC - Konstantinos Papalias: FYI @Dan Kitchen (TCEU) ^
----
2020-04-21 13:33:44 UTC - Sankararao Routhu: Hi We are trying to deploy pulsar on AWS. We deployed zookeepers( 3 in west and 3 in east) and they are able to connect to each other. Now We are deploying book keeper on another node in same VPC but while starting the book keeper we are getting following error. Can you please help us to debug this issue
----
2020-04-21 13:34:02 UTC - Sankararao Routhu: ```13:31:26.613 [main-EventThread] INFO org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase - ZooKeeper client is connected now.
13:31:27.066 [main] WARN org.apache.bookkeeper.util.EventLoopUtil - Could not use Netty Epoll event loop: failed to load the required native library
13:31:27.176 [main] INFO org.apache.bookkeeper.client.BookKeeper - Weighted ledger placement is not enabled
13:31:27.211 [main] ERROR org.apache.bookkeeper.client.BookieWatcherImpl - Failed to get bookie list :
org.apache.bookkeeper.client.BKException$ZKException: Error while using ZooKeeper
at org.apache.bookkeeper.discover.ZKRegistrationClient.lambda$getChildren$0(ZKRegistrationClient.java:220) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.bookkeeper.zookeeper.ZooKeeperClient$25$1.processResult(ZooKeeperClient.java:1174) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:630) ~[org.apache.pulsar-pulsar-zookeeper-2.5.0.jar:2.5.0]
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) ~[org.apache.pulsar-pulsar-zookeeper-2.5.0.jar:2.5.0]
Exception in thread "main" com.google.common.util.concurrent.UncheckedExecutionException: Error while using ZooKeeper
at org.apache.bookkeeper.tools.cli.commands.bookie.SanityTestCommand.apply(SanityTestCommand.java:80)
at org.apache.bookkeeper.bookie.BookieShell$BookieSanityTestCmd.runCmd(BookieShell.java:887)
at org.apache.bookkeeper.bookie.BookieShell$MyCommand.runCmd(BookieShell.java:223)
at org.apache.bookkeeper.bookie.BookieShell.run(BookieShell.java:1976)
at org.apache.bookkeeper.bookie.BookieShell.main(BookieShell.java:2067)
Caused by: org.apache.bookkeeper.client.BKException$ZKException: Error while using ZooKeeper
at org.apache.bookkeeper.discover.ZKRegistrationClient.lambda$getChildren$0(ZKRegistrationClient.java:220)
at org.apache.bookkeeper.zookeeper.ZooKeeperClient$25$1.processResult(ZooKeeperClient.java:1174)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:630)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)```
----
2020-04-21 13:37:08 UTC - Sankararao Routhu: snipper of <http://bookkeeper.co|bookkeeper.conf>
```zkServers=<z.ip1>:2181,<ia.p2>:2181,<a.ip3>:2181
extraServerComponents=org.apache.bookkeeper.stream.server.StreamStorageLifecycleComponent
dbStorage_writeCacheMaxSizeMb: "2048" # Write cache size (direct memory)
dbStorage_readAheadCacheMaxSizeMb: "2048" # Read cache size (direct memory)
dbStorage_rocksDB_blockCacheSize: "4294967296"
journalMaxSizeMB: "2048"
statsProviderClass: org.apache.bookkeeper.stats.prometheus.PrometheusMetricsProvider
useHostNameAsBookieID: "false"
advertisedAddress :<book keeper ip>
allowLoopback:"false"```
----
2020-04-21 14:06:57 UTC - FSI: @FSI has joined the channel
----
2020-04-21 15:14:36 UTC - Curtis Cook: Looks like this has been released, thanks!
----
2020-04-21 18:59:51 UTC - JG: Hello guys, it could be a "rookie" question, but what is a difference between a subscription and a topic and why we have to define a subscription on a consumer ? Subscription should be always different ???
----
2020-04-21 19:00:37 UTC - Matteo Merli: A subscription is used to keep your position while consuming from a topic
----
2020-04-21 19:01:07 UTC - Matteo Merli: You can have different applications consuming the same data, using different subscriptions (each with its own name)
----
2020-04-21 19:11:30 UTC - JG: ah ok so I can have a topic and 2 apps have 1 subscription each one
----
2020-04-21 19:11:50 UTC - JG: topic: test-topic and subscription: app1 / app2
----
2020-04-21 19:12:37 UTC - JG: when u say keeping the position, you mean pointing to the correct offset ? ( stream )
----
2020-04-21 19:17:12 UTC - Matteo Merli: correct
----
2020-04-21 19:18:47 UTC - JG: imagine that we have a client in JS using websocket, then each user should have his own subscription ?
----
2020-04-21 19:33:45 UTC - Sam Leung: Hello, is there Pulsar support directly, or something someone has built on the existing Pulsar tools, to “shovel” all messages in one topic to another?
----
2020-04-21 19:42:25 UTC - Greg Methvin: This is just a degenerate case of Pulsar functions, where the function is the identity function.
----
2020-04-21 19:59:49 UTC - Matteo Merli: If they all want to get all the messages, yes
----
2020-04-21 20:18:46 UTC - JG: I understand thank you for your help!
----
2020-04-21 20:22:05 UTC - Mike Russell: Hey there, novice java user, really dumb question, once I checkout a certain tag and make sure the latest changes are pulled for the repo, does:
```mvn install -DskipTests```
actually build the binaries for /bin?
If it doesn’t how do I build the binaries for /bin?
Is there a maven lifecycle hook for it?
For instance I’m noticing that the binary of pulsar-admin functions localrun, doesn’t know of the options “--functionConfig” or “--sourceConfig” but I see them in the source code here:
<https://github.com/apache/pulsar/blob/v2.5.1/pulsar-functions/localrun/src/main/java/org/apache/pulsar/functions/LocalRunner.java>
----
2020-04-21 20:22:56 UTC - JG: Hi,
A quick question regarding the event stream storage, is it possible to know which message got a negative ack ?
I see: __value__ | __partition__ | __event_time__ | __publish_time__ | __message_id__ | __sequence_id__ | __producer_name__ | __key__ | __properties__
But I don't see any status column ?
Here is the code for the idea:
```.subscriptionName("app1")
.messageListener((consumer, msg) -> {
System.out.println("RECEIVED MESSSAGE !!!!");
try {
<http://log.info|log.info>("Received message with an ID of {} and a payload of {}",
msg.getMessageId().toString(),
new String(msg.getData()));
Integer a = null;
a.toString();
consumer.acknowledge(msg);
} catch (Exception e){
System.out.println("nb retries: "+msg.getRedeliveryCount());
log.error("ACK PROB ---> "+e.getMessage());
---> // add bad status on mesage
consumer.negativeAcknowledge(msg);
}
})
.subscribe();```
----
2020-04-21 20:24:01 UTC - Matteo Merli: > actually build the binaries for /bin?
Yes. After that you can call `bin/pulsar-admin` and the other commands.
It also creates the tar.gz which eventually gets released. That would be under `distribution/server/target/...`
----
2020-04-21 20:24:50 UTC - Matteo Merli: No, the messages themselves are immutable
----
2020-04-21 20:25:47 UTC - Matteo Merli: You can route the messages to another topic (either manually or through the dead-letter-queue) to check on the failed messages
----
2020-04-21 20:35:13 UTC - JG: yes u right! my goal was to do like event-sourcing or event-streaming, I would like to only keep sucessfull events and put bad ones in another topic maybe
----
2020-04-21 20:38:47 UTC - Mike Russell: thanks @Matteo Merli!
So for instance, I built it, then created a temporary directory and copied over:
```target/apache-pulsar-2.5.1-bin.tar.gz```
and then:
```tar xvf apache-pulsar-2.5.1-bin.tar.gz```
then try to:
```cd /Users/first.last/Projects/pulsar/temp_bin/apache-pulsar-2.5.1/bin
./pulsar-admin functions localrun --functionConfig xx.json --sourceConfig yy.json```
it still spits back:
```Warning: Nashorn engine is planned to be removed from a future JDK release
Unknown option: --functionConfig```
Am I wrong to believe that it should have that option from this:
<https://github.com/apache/pulsar/blob/v2.5.1/pulsar-functions/localrun/src/main/java/org/apache/pulsar/functions/LocalRunner.java>
----
2020-04-21 20:40:03 UTC - Matteo Merli: not sure about the specific option, you can try with `./pulsar-admin functions localrun --help`
----
2020-04-21 20:43:30 UTC - Mike Russell: ```Run a Pulsar Function locally, rather than deploy to a Pulsar cluster)
Usage: localrun [options]
Options:
--auto-ack
Whether or not the framework acknowledges messages automatically
--broker-service-url
The URL for Pulsar broker
--classname
The class name of a Pulsar Function
--client-auth-params
Client authentication param
--client-auth-plugin
Client authentication plugin using which function-process can connect to
broker
--cpu
The cpu in cores that need to be allocated per function
instance(applicable only to docker runtime)
--custom-runtime-options
A string that encodes options to customize the runtime, see docs for
configured runtime for details
--custom-schema-inputs
The map of input topics to Schema class names (as a JSON string)
--custom-serde-inputs
The map of input topics to SerDe class names (as a JSON string)
--dead-letter-topic
The topic where messages that are not processed successfully are sent to
--disk
The disk in bytes that need to be allocated per function
instance(applicable only to docker runtime)
--fqfn
The Fully Qualified Function Name (FQFN) for the function
--function-config-file
The path to a YAML config file that specifies the configuration of a
Pulsar Function
--go
Path to the main Go executable binary for the function (if the function
is written in Go)
--hostname-verification-enabled
Enable hostname verification
Default: false
-i, --inputs
The input topic or topics (multiple topics can be specified as a
comma-separated list) of a Pulsar Function
--instance-id-offset
Start the instanceIds from this offset
Default: 0
--jar
Path to the JAR file for the function (if the function is written in
Java). It also supports URL path [http/https/file (file protocol assumes that
file already exists on worker host)] from which worker can download the
package.
--log-topic
The topic to which the logs of a Pulsar Function are produced
--max-message-retries
How many times should we try to process a message before giving up
--name
The name of a Pulsar Function
--namespace
The namespace of a Pulsar Function
-o, --output
The output topic of a Pulsar Function (If none is specified, no output is
written)
--output-serde-classname
The SerDe class to be used for messages output by the function
--parallelism
The parallelism factor of a Pulsar Function (i.e. the number of function
instances to run)
--processing-guarantees
The processing guarantees (aka delivery semantics) applied to the
function
Possible Values: [ATLEAST_ONCE, ATMOST_ONCE, EFFECTIVELY_ONCE]
--py
Path to the main Python file/Python Wheel file for the function (if the
function is written in Python)
--ram
The ram in bytes that need to be allocated per function
instance(applicable only to process/docker runtime)
--retain-ordering
Function consumes and processes messages in order
-st, --schema-type
The builtin schema type or custom schema class name to be used for
messages output by the function
Default: <empty string>
--sliding-interval-count
The number of messages after which the window slides
--sliding-interval-duration-ms
The time duration after which the window slides
--state-storage-service-url
The URL for the state storage service (the default is Apache BookKeeper)
--subs-name
Pulsar source subscription name if user wants a specific
subscription-name for input-topic consumer
--tenant
The tenant of a Pulsar Function
--timeout-ms
The message timeout in milliseconds
--tls-allow-insecure
Allow insecure tls connection
Default: false
--tls-trust-cert-path
tls trust cert file path
--topics-pattern
The topic pattern to consume from list of topics under a namespace that
match the pattern. [--input] and [--topic-pattern] are mutually exclusive. Add
SerDe class name for a pattern in --custom-serde-inputs (supported for java fun
only)
--use-tls
Use tls connection
Default: false
--user-config
User-defined config key/values
--window-length-count
The number of messages per window
--window-length-duration-ms
The time duration of the window in milliseconds```
----
2020-04-21 20:43:30 UTC - Mike Russell: :confused: sadly yeah I don’t see that option… and it’s using the 2.5.1 bin..
I guess my head is lost in understanding if the localrunner.java file “truely” is the thing that runs when you call:
pulsar-admin functions localrun
----
2020-04-21 20:45:08 UTC - Matteo Merli: ` --function-config-file` ?
----
2020-04-21 20:45:52 UTC - Sam Leung: That certainly is one way to do it, but we currently do not have Pulsar Functions enabled, was just wondering if anyone has explored other methods, specifically for administrative, one-time operations.
----
2020-04-21 20:46:17 UTC - Matteo Merli: It can be done automatically: <https://pulsar.apache.org/docs/en/concepts-messaging/#dead-letter-topic>
----
2020-04-21 20:47:03 UTC - Matteo Merli: basically, after N negativeAcks, a message will get sent to the DLQ
----
2020-04-21 20:51:43 UTC - Mike Russell: hmmmm yeah I guess that’s all that’s allowed is the yaml
I got caught up thinking it would be possible with the json for configuration… but even looking at 2.5.0 tag on that LocalRunner.java file it looks like both --functionConfig / --sourceConfig where there so they’re not new
So LocalRunner.java must be for something vastly different instead of “functions localrun”
…
but I don’t see “--function-config-file” for the yaml stuff inside the LocalRunner.java file
----
2020-04-21 20:51:45 UTC - Mike Russell: oh wait!
----
2020-04-21 20:51:49 UTC - Mike Russell: it’s not “pulsar-admin”
----
2020-04-21 20:51:49 UTC - JG: yes but dead topics only works with SHARED subscription, my problem is that I need to keep message ordering and only Esclusive subscirption can do that.
----
2020-04-21 20:51:54 UTC - Mike Russell: it’s “function-localrunner”
----
2020-04-21 20:52:02 UTC - Mike Russell: ^ that’s the one that has that option
----
2020-04-21 20:52:03 UTC - Mike Russell: interesting
----
2020-04-21 20:52:04 UTC - Mike Russell: haha
----
2020-04-21 20:52:58 UTC - Matteo Merli: ok, then the question on why we're using 2 different conventions for the CLI args... we'll leave it for another day :slightly_smiling_face:
----
2020-04-21 20:53:35 UTC - JG: I need the message ordering guarentee because when I will replay these events it should re-create the model ( a "delete" cannot be before a "new", .. ).
----
2020-04-21 21:02:45 UTC - Matteo Merli: If you need ordering, then you will have to keep trying (forever) to process a specific message. If you "give up" after N retries, you're implicitly breaking the ordering
----
2020-04-21 21:06:35 UTC - Greg Methvin: I’m not aware of any other “built-in” way to do it. If you can’t run the function on the broker you can run it separately: <http://pulsar.apache.org/docs/en/functions-worker/#configure-functions-worker-to-run-separately>
----
2020-04-21 21:07:39 UTC - JG: This is true, thats why there is no sens to keep error messages, I could just throw them away after a quick retry and thats it
----
2020-04-21 23:05:08 UTC - Gurpartap Singh: @Gurpartap Singh has joined the channel
----
2020-04-21 23:46:30 UTC - Ben: @Ben has joined the channel
----
2020-04-22 02:24:29 UTC - Ayyanar: @Ayyanar has joined the channel
----
2020-04-22 02:53:14 UTC - Alexander Ursu: Hi, was wondering if the streamnative Terraform provider (<https://github.com/streamnative/terraform-provider-pulsar>) is still maintained, and if there are any plans to add more features, like provisioning offloading, sources, sinks, etc.
----