You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2020/04/22 09:11:04 UTC

Slack digest for #general - 2020-04-22

2020-04-21 09:28:22 UTC - Sijie Guo: When you create pulsar function, you need to use multipart to upload the function jar file. The last comment in that issue gives an example of how it was done in pulsarctl.
----
2020-04-21 09:47:33 UTC - Konstantinos Papalias: FYI @Dan Kitchen (TCEU)  ^
----
2020-04-21 13:33:44 UTC - Sankararao Routhu: Hi We are trying to deploy pulsar on AWS. We deployed zookeepers( 3 in west and 3 in east) and they are able to connect to each other. Now We are deploying book keeper on another node in same VPC but while starting the book keeper we are getting following error. Can you please help us to debug this issue
----
2020-04-21 13:34:02 UTC - Sankararao Routhu: ```13:31:26.613 [main-EventThread] INFO  org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase - ZooKeeper client is connected now.
13:31:27.066 [main] WARN  org.apache.bookkeeper.util.EventLoopUtil - Could not use Netty Epoll event loop: failed to load the required native library
13:31:27.176 [main] INFO  org.apache.bookkeeper.client.BookKeeper - Weighted ledger placement is not enabled
13:31:27.211 [main] ERROR org.apache.bookkeeper.client.BookieWatcherImpl - Failed to get bookie list : 
org.apache.bookkeeper.client.BKException$ZKException: Error while using ZooKeeper
	at org.apache.bookkeeper.discover.ZKRegistrationClient.lambda$getChildren$0(ZKRegistrationClient.java:220) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
	at org.apache.bookkeeper.zookeeper.ZooKeeperClient$25$1.processResult(ZooKeeperClient.java:1174) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:630) ~[org.apache.pulsar-pulsar-zookeeper-2.5.0.jar:2.5.0]
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) ~[org.apache.pulsar-pulsar-zookeeper-2.5.0.jar:2.5.0]
Exception in thread "main" com.google.common.util.concurrent.UncheckedExecutionException: Error while using ZooKeeper
	at org.apache.bookkeeper.tools.cli.commands.bookie.SanityTestCommand.apply(SanityTestCommand.java:80)
	at org.apache.bookkeeper.bookie.BookieShell$BookieSanityTestCmd.runCmd(BookieShell.java:887)
	at org.apache.bookkeeper.bookie.BookieShell$MyCommand.runCmd(BookieShell.java:223)
	at org.apache.bookkeeper.bookie.BookieShell.run(BookieShell.java:1976)
	at org.apache.bookkeeper.bookie.BookieShell.main(BookieShell.java:2067)
Caused by: org.apache.bookkeeper.client.BKException$ZKException: Error while using ZooKeeper
	at org.apache.bookkeeper.discover.ZKRegistrationClient.lambda$getChildren$0(ZKRegistrationClient.java:220)
	at org.apache.bookkeeper.zookeeper.ZooKeeperClient$25$1.processResult(ZooKeeperClient.java:1174)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:630)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)```
----
2020-04-21 13:37:08 UTC - Sankararao Routhu: snipper of <http://bookkeeper.co|bookkeeper.conf>
```zkServers=&lt;z.ip1&gt;:2181,&lt;ia.p2&gt;:2181,&lt;a.ip3&gt;:2181
extraServerComponents=org.apache.bookkeeper.stream.server.StreamStorageLifecycleComponent
dbStorage_writeCacheMaxSizeMb: "2048" # Write cache size (direct memory)
  dbStorage_readAheadCacheMaxSizeMb: "2048" # Read cache size (direct memory)
  dbStorage_rocksDB_blockCacheSize: "4294967296"
  journalMaxSizeMB: "2048"
  statsProviderClass: org.apache.bookkeeper.stats.prometheus.PrometheusMetricsProvider
  useHostNameAsBookieID: "false"
  advertisedAddress	:&lt;book keeper ip&gt; 
  allowLoopback:"false"```
----
2020-04-21 14:06:57 UTC - FSI: @FSI has joined the channel
----
2020-04-21 15:14:36 UTC - Curtis Cook: Looks like this has been released, thanks!
----
2020-04-21 18:59:51 UTC - JG: Hello guys, it could be a "rookie" question, but what is a difference between a subscription and a topic and why we have to define a subscription on a consumer ? Subscription should be always different ???
----
2020-04-21 19:00:37 UTC - Matteo Merli: A subscription is used to keep your position while consuming from a topic
----
2020-04-21 19:01:07 UTC - Matteo Merli: You can have different applications consuming the same data, using different subscriptions (each with its own name)
----
2020-04-21 19:11:30 UTC - JG: ah ok so I can have a topic and 2 apps have 1 subscription each one
----
2020-04-21 19:11:50 UTC - JG: topic:  test-topic  and subscription: app1 / app2
----
2020-04-21 19:12:37 UTC - JG: when u say keeping the position, you mean pointing to the correct offset ? ( stream )
----
2020-04-21 19:17:12 UTC - Matteo Merli: correct
----
2020-04-21 19:18:47 UTC - JG: imagine that we have a client in JS using websocket, then each user should have his own subscription ?
----
2020-04-21 19:33:45 UTC - Sam Leung: Hello, is there Pulsar support directly, or something someone has built on the existing Pulsar tools, to “shovel” all messages in one topic to another?
----
2020-04-21 19:42:25 UTC - Greg Methvin: This is just a degenerate case of Pulsar functions, where the function is the identity function.
----
2020-04-21 19:59:49 UTC - Matteo Merli: If they all want to get all the messages, yes
----
2020-04-21 20:18:46 UTC - JG: I understand thank you for your help!
----
2020-04-21 20:22:05 UTC - Mike Russell: Hey there, novice java user, really dumb question, once I checkout a certain tag and make sure the latest changes are pulled for the repo, does:
```mvn install -DskipTests```
actually build the binaries for /bin?
If it doesn’t how do I build the binaries for /bin?
Is there a maven lifecycle hook for it?
For instance I’m noticing that the binary of pulsar-admin functions localrun, doesn’t know of the options “--functionConfig” or “--sourceConfig” but I see them in the source code here:
<https://github.com/apache/pulsar/blob/v2.5.1/pulsar-functions/localrun/src/main/java/org/apache/pulsar/functions/LocalRunner.java>
----
2020-04-21 20:22:56 UTC - JG: Hi,
A quick question regarding the event stream storage, is it possible to know which message got a negative ack ?
I see: __value__              | __partition__ | __event_time__ |    __publish_time__     | __message_id__ | __sequence_id__ | __producer_name__ | __key__ | __properties__
But I don't see any status column ?
Here is the code for the idea:

```.subscriptionName("app1")
.messageListener((consumer, msg) -&gt; {
    System.out.println("RECEIVED MESSSAGE !!!!");
    try {
        <http://log.info|log.info>("Received message with an ID of {} and a payload of {}",
                msg.getMessageId().toString(),
                new String(msg.getData()));
        Integer a = null;
        a.toString();
        consumer.acknowledge(msg);
    } catch (Exception e){
        System.out.println("nb retries: "+msg.getRedeliveryCount());
        log.error("ACK PROB ---&gt; "+e.getMessage());

---&gt;     // add bad status on mesage
        consumer.negativeAcknowledge(msg);
    }
})
.subscribe();```

----
2020-04-21 20:24:01 UTC - Matteo Merli: &gt;  actually build the binaries for /bin?
Yes. After that you can call `bin/pulsar-admin` and the other commands.

It also creates the tar.gz which eventually gets released. That would be under `distribution/server/target/...`
----
2020-04-21 20:24:50 UTC - Matteo Merli: No, the messages themselves are immutable
----
2020-04-21 20:25:47 UTC - Matteo Merli: You can route the messages to another topic (either manually or through the dead-letter-queue) to check on the failed messages
----
2020-04-21 20:35:13 UTC - JG: yes u right! my goal was to do like event-sourcing or event-streaming, I would like to only keep sucessfull events and put bad ones in another topic maybe
----
2020-04-21 20:38:47 UTC - Mike Russell: thanks @Matteo Merli!
So for instance, I built it, then created a temporary directory and copied over:
```target/apache-pulsar-2.5.1-bin.tar.gz```
and then:
```tar xvf apache-pulsar-2.5.1-bin.tar.gz```
then try to:
```cd /Users/first.last/Projects/pulsar/temp_bin/apache-pulsar-2.5.1/bin

./pulsar-admin functions localrun --functionConfig xx.json --sourceConfig yy.json```
it still spits back:
```Warning: Nashorn engine is planned to be removed from a future JDK release
Unknown option: --functionConfig```
Am I wrong to believe that it should have that option from this:
<https://github.com/apache/pulsar/blob/v2.5.1/pulsar-functions/localrun/src/main/java/org/apache/pulsar/functions/LocalRunner.java>
----
2020-04-21 20:40:03 UTC - Matteo Merli: not sure about the specific option, you can try with `./pulsar-admin functions localrun --help`
----
2020-04-21 20:43:30 UTC - Mike Russell: ```Run a Pulsar Function locally, rather than deploy to a Pulsar cluster)
Usage: localrun [options]
  Options:
    --auto-ack
       Whether or not the framework acknowledges messages automatically
    --broker-service-url
       The URL for Pulsar broker
    --classname
       The class name of a Pulsar Function
    --client-auth-params
       Client authentication param
    --client-auth-plugin
       Client authentication plugin using which function-process can connect to
       broker
    --cpu
       The cpu in cores that need to be allocated per function
       instance(applicable only to docker runtime)
    --custom-runtime-options
       A string that encodes options to customize the runtime, see docs for
       configured runtime for details
    --custom-schema-inputs
       The map of input topics to Schema class names (as a JSON string)
    --custom-serde-inputs
       The map of input topics to SerDe class names (as a JSON string)
    --dead-letter-topic
       The topic where messages that are not processed successfully are sent to
    --disk
       The disk in bytes that need to be allocated per function
       instance(applicable only to docker runtime)
    --fqfn
       The Fully Qualified Function Name (FQFN) for the function
    --function-config-file
       The path to a YAML config file that specifies the configuration of a
       Pulsar Function
    --go
       Path to the main Go executable binary for the function (if the function
       is written in Go)
    --hostname-verification-enabled
       Enable hostname verification
       Default: false
    -i, --inputs
       The input topic or topics (multiple topics can be specified as a
       comma-separated list) of a Pulsar Function
    --instance-id-offset
       Start the instanceIds from this offset
       Default: 0
    --jar
       Path to the JAR file for the function (if the function is written in
       Java). It also supports URL path [http/https/file (file protocol assumes that
       file already exists on worker host)] from which worker can download the
       package.
    --log-topic
       The topic to which the logs of a Pulsar Function are produced
    --max-message-retries
       How many times should we try to process a message before giving up
    --name
       The name of a Pulsar Function
    --namespace
       The namespace of a Pulsar Function
    -o, --output
       The output topic of a Pulsar Function (If none is specified, no output is
       written)
    --output-serde-classname
       The SerDe class to be used for messages output by the function
    --parallelism
       The parallelism factor of a Pulsar Function (i.e. the number of function
       instances to run)
    --processing-guarantees
       The processing guarantees (aka delivery semantics) applied to the
       function
       Possible Values: [ATLEAST_ONCE, ATMOST_ONCE, EFFECTIVELY_ONCE]
    --py
       Path to the main Python file/Python Wheel file for the function (if the
       function is written in Python)
    --ram
       The ram in bytes that need to be allocated per function
       instance(applicable only to process/docker runtime)
    --retain-ordering
       Function consumes and processes messages in order
    -st, --schema-type
       The builtin schema type or custom schema class name to be used for
       messages output by the function
       Default: &lt;empty string&gt;
    --sliding-interval-count
       The number of messages after which the window slides
    --sliding-interval-duration-ms
       The time duration after which the window slides
    --state-storage-service-url
       The URL for the state storage service (the default is Apache BookKeeper)
    --subs-name
       Pulsar source subscription name if user wants a specific
       subscription-name for input-topic consumer
    --tenant
       The tenant of a Pulsar Function
    --timeout-ms
       The message timeout in milliseconds
    --tls-allow-insecure
       Allow insecure tls connection

       Default: false
    --tls-trust-cert-path
       tls trust cert file path
    --topics-pattern
       The topic pattern to consume from list of topics under a namespace that
       match the pattern. [--input] and [--topic-pattern] are mutually exclusive. Add
       SerDe class name for a pattern in --custom-serde-inputs (supported for java fun
       only)
    --use-tls
       Use tls connection

       Default: false
    --user-config
       User-defined config key/values
    --window-length-count
       The number of messages per window
    --window-length-duration-ms
       The time duration of the window in milliseconds```
----
2020-04-21 20:43:30 UTC - Mike Russell: :confused: sadly yeah I don’t see that option… and it’s using the 2.5.1 bin..
I guess my head is lost in understanding if the localrunner.java file “truely” is the thing that runs when you call:
pulsar-admin functions localrun
----
2020-04-21 20:45:08 UTC - Matteo Merli: `    --function-config-file` ?
----
2020-04-21 20:45:52 UTC - Sam Leung: That certainly is one way to do it, but we currently do not have Pulsar Functions enabled, was just wondering if anyone has explored other methods, specifically for administrative, one-time operations.
----
2020-04-21 20:46:17 UTC - Matteo Merli: It can be done automatically: <https://pulsar.apache.org/docs/en/concepts-messaging/#dead-letter-topic>
----
2020-04-21 20:47:03 UTC - Matteo Merli: basically, after N negativeAcks, a message will get sent to the DLQ
----
2020-04-21 20:51:43 UTC - Mike Russell: hmmmm yeah I guess that’s all that’s allowed is the yaml

I got caught up thinking it would be possible with the json for configuration… but even looking at 2.5.0 tag on that LocalRunner.java file it looks like both --functionConfig / --sourceConfig where there so they’re not new

So LocalRunner.java must be for something vastly different instead of “functions localrun”
…
but I don’t see “--function-config-file” for the yaml stuff inside the LocalRunner.java file
----
2020-04-21 20:51:45 UTC - Mike Russell: oh wait!
----
2020-04-21 20:51:49 UTC - Mike Russell: it’s not “pulsar-admin”
----
2020-04-21 20:51:49 UTC - JG: yes but dead topics only works with SHARED subscription, my problem is that I need to keep message ordering and only Esclusive subscirption can do that.
----
2020-04-21 20:51:54 UTC - Mike Russell: it’s “function-localrunner”
----
2020-04-21 20:52:02 UTC - Mike Russell: ^ that’s the one that has that option
----
2020-04-21 20:52:03 UTC - Mike Russell: interesting
----
2020-04-21 20:52:04 UTC - Mike Russell: haha
----
2020-04-21 20:52:58 UTC - Matteo Merli: ok, then the question on why we're using 2 different conventions for the CLI args... we'll leave it for another day :slightly_smiling_face:
----
2020-04-21 20:53:35 UTC - JG: I need the message ordering guarentee because when I will replay these events it should re-create the model ( a "delete" cannot be before a "new", .. ).
----
2020-04-21 21:02:45 UTC - Matteo Merli: If you need ordering, then you will have to keep trying (forever) to process a specific message. If you "give up" after N retries, you're implicitly breaking the ordering
----
2020-04-21 21:06:35 UTC - Greg Methvin: I’m not aware of any other “built-in” way to do it. If you can’t run the function on the broker you can run it separately: <http://pulsar.apache.org/docs/en/functions-worker/#configure-functions-worker-to-run-separately>
----
2020-04-21 21:07:39 UTC - JG: This is true, thats why there is no sens to keep error messages, I could just throw them away after a quick retry and thats it
----
2020-04-21 23:05:08 UTC - Gurpartap Singh: @Gurpartap Singh has joined the channel
----
2020-04-21 23:46:30 UTC - Ben: @Ben has joined the channel
----
2020-04-22 02:24:29 UTC - Ayyanar: @Ayyanar has joined the channel
----
2020-04-22 02:53:14 UTC - Alexander Ursu: Hi, was wondering if the streamnative Terraform provider (<https://github.com/streamnative/terraform-provider-pulsar>) is still maintained, and if there are any plans to add more features, like provisioning offloading, sources, sinks, etc.
----