Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2020/08/05 09:11:04 UTC

Slack digest for #general - 2020-08-05

2020-08-04 10:18:54 UTC - Ermir Zaimi: I'm trying to create a Pulsar function in standalone mode using just pulsar-admin, and I get the same error. How can I check whether Pulsar Functions is configured correctly?
----
2020-08-04 11:38:23 UTC - Gilles Barbier: Hi, does anyone use function state in production? I'm asking because we quickly ran into some issues, described here (<https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1593169043352700>).
----
2020-08-04 11:50:09 UTC - Caleb Epstein: @Addison Higham thanks!  I'll submit a pull request to update the docs on those functions in the C++ client lib.
----
2020-08-04 11:51:13 UTC - Caleb Epstein: I think TTL of 7 days and a retention time of 3 days would work for that.
----
2020-08-04 12:12:18 UTC - Caleb Epstein: <https://github.com/apache/pulsar/pull/7745>
----
2020-08-04 12:49:31 UTC - markg: Hey all - just looking at pulsar to kafka comparisons, wondering if anyone can validate this post: <https://www.kai-waehner.de/blog/2020/06/09/apache-kafka-versus-apache-pulsar-event-streaming-comparison-features-myths-explored/>
----
2020-08-04 13:15:41 UTC - Caleb Epstein: "ZooKeeper is Kafka’s biggest scalability bottleneck and comes with operational challenges — This is true for Kafka but even more so for Pulsar!" - seems specious
----
2020-08-04 13:21:31 UTC - Caleb Epstein: The analysis seems a bit biased, especially given the author's bio says "<https://www.kai-waehner.de/blog/author/kai-waehner/|Kai Waehner> builds mission-critical, scalable, streaming infrastructures with Apache Kafka" :slightly_smiling_face:
----
2020-08-04 13:44:02 UTC - Rahul Vashishth: How does the auto-scaling of ZooKeeper and bookie pods work? Do both bookies and ZooKeeper need to maintain a quorum? Say I have 4 bookie pods associated with 4 persistent volumes, with the write quorum set to 2. Is there data loss if we reduce the bookie pod count to 3?
----
2020-08-04 13:47:41 UTC - Rahul Vashishth: How do folks estimate Pulsar cluster sizing? How should we calculate the VM sizes required for a cluster for a given throughput, latency and number of concurrent connections (producers/consumers)?
----
2020-08-04 15:18:44 UTC - Alexandre DUVAL: Hi, does restarting a standalone function worker not restart all "registered" functions? Should I have multiple nodes to handle the restart case?
----
2020-08-04 15:25:37 UTC - Axel Sirota: @Jerry Peng maybe you know? :slightly_smiling_face:
----
2020-08-04 15:53:14 UTC - Addison Higham: @markg that article is... not great. Certainly some of these are opinions, but there are other things that aren't very accurate. There was some discussion in the community about whether it would be worth writing a response post, but I think it started to feel a bit too much like an argument, so we would rather just continue to educate about Pulsar and address questions that come from it directly :)

A few points to share though:
- Pulsar Functions isn't trying to be what Kafka Streams is; it is intended for lightweight, simple use cases. Use Flink for more general-purpose stream processing
- Many of the benefits he discusses are really about Confluent Cloud (like tiered storage and elasticity), which is fine if you want to use a vendor, but those features are core parts of open source Pulsar
- The characterization of the extra layer meaning many more resources, network hops, etc. isn't entirely correct, as it really depends on your use case. In most cases of "tailing reads" (where you are caught up to the producer) you don't read from BookKeeper at all: those reads are served out of the broker's memory without hitting a bookie. In the case where you do need to read old data from BookKeeper, the architecture of Pulsar can also have advantages, in that write performance can be maintained even with lots of catch-up reads
- In talking about benchmarks, Pulsar defaults to stronger consistency guarantees, so configuring Kafka to match is more apples to apples in some sense. Yes, Kafka can have great performance, but it also comes with fewer consistency guarantees. If you want to see what that means practically, these posts are a good comparison of the defaults:
<https://jack-vanlightly.com/blog/2018/9/14/how-to-lose-messages-on-a-kafka-cluster-part1>
<https://jack-vanlightly.com/blog/2018/10/21/how-to-not-lose-messages-on-an-apache-pulsar-cluster>
+1 : Caleb Epstein
----
2020-08-04 15:53:57 UTC - Addison Higham: there is a lot more that could be said, if you have any specific questions, I would be happy to answer them, but hopefully that gives you some idea of where that article can be misleading
----
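The tailing-read versus catch-up-read distinction in the message above can be pictured with a toy model. This is only a sketch under stated assumptions: the `broker` type, its `cacheSize` field and the `read` method are hypothetical and do not mirror Pulsar's actual classes, and a plain slice stands in for BookKeeper.

```go
package main

import "fmt"

// broker is a toy model (hypothetical, not Pulsar code): the newest
// cacheSize entries are treated as held in broker memory, everything
// older must come from the durable store standing in for BookKeeper.
type broker struct {
	store     []string // every entry ever written; index = entry ID
	cacheSize int      // how many of the newest entries count as cached
}

// write appends an entry and returns its ID.
func (b *broker) write(payload string) int {
	b.store = append(b.store, payload)
	return len(b.store) - 1
}

// read returns the entry plus a label saying where it was served from.
func (b *broker) read(id int) (payload, source string, ok bool) {
	if id < 0 || id >= len(b.store) {
		return "", "", false
	}
	if id >= len(b.store)-b.cacheSize {
		return b.store[id], "broker cache", true // tailing read
	}
	return b.store[id], "bookkeeper", true // catch-up read
}

func main() {
	b := &broker{cacheSize: 2}
	for i := 0; i < 5; i++ {
		b.write(fmt.Sprintf("msg-%d", i))
	}
	for _, id := range []int{4, 0} {
		payload, source, _ := b.read(id)
		fmt.Printf("entry %d (%s) served from %s\n", id, payload, source)
	}
}
```

In this model a consumer that keeps up with the producer never touches the durable store, which is the situation the message describes for tailing reads.
----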
2020-08-04 15:56:09 UTC - Addison Higham: that is a good idea
----
2020-08-04 15:56:26 UTC - Addison Higham: something we could probably do
----
2020-08-04 16:53:10 UTC - Addison Higham: that doesn't seem right... AFAIK one function worker basically is elected leader and does the scheduling anyways. Might be worth providing some more details in an issue?
----
2020-08-04 17:58:12 UTC - Pavel Krupets: @Pavel Krupets has joined the channel
----
2020-08-04 18:56:12 UTC - Devin G. Bost: Has anyone here successfully compiled Pulsar on a Windows machine? When I try to do that, it runs for 3 seconds and says the build was successful but doesn't actually build anything.
----
2020-08-04 19:06:21 UTC - Devin G. Bost: If I run it with `mvn install -U`, then it downloads the dependencies but then does nothing after that.
----
2020-08-04 19:09:22 UTC - Devin G. Bost: If I navigate into one of the project modules (such as `tests`), it runs as expected, but it fails because not all of the Pulsar dependencies were built.
----
2020-08-04 19:18:16 UTC - Jonathan Ellis: @Jonathan Ellis has joined the channel
----
2020-08-04 19:45:57 UTC - Jonathan Ellis: @Devin G. Bost I don't think native Windows is supported.  It builds for me under WSL but I get test failures
+1 : Devin G. Bost
----
2020-08-04 19:50:08 UTC - Jonathan Ellis: I guess this is my excuse to upgrade to WSL2
----
2020-08-04 21:31:20 UTC - Chris: Does pulsar support using custom coders in avro? I've looked through the code paths a bit and it deftly avoids both custom coders and 1.9.2's fastdecode option.
----
2020-08-04 23:53:38 UTC - Matt Mitchell: I’m seeing a potential memory issue in an application I’m working on, where `org.apache.pulsar.shade.io.netty.util.HashedWheelTimer` has a single reference to `org.apache.pulsar.shade.io.netty.util.HashedWheelTimer$HashedWheelBucket`, which has several references to the application’s objects. Is this expected / normal? I guess more of a netty thing than Pulsar, but thought I’d see if this looks familiar to anyone.
----
2020-08-05 00:30:27 UTC - Brandon Whitehead: how do I actually do this tutorial?
----
2020-08-05 00:30:34 UTC - Brandon Whitehead: <https://pulsar.apache.org/docs/en/standalone/>
----
2020-08-05 00:31:13 UTC - Brandon Whitehead: I have done the tarball part, but when I do "bin/pulsar standalone", it just opens some file in Atom
----
2020-08-05 00:31:27 UTC - Brandon Whitehead: please, idk what to do
----
2020-08-05 00:34:58 UTC - Brandon Whitehead: apparently the command causes me to open the file in my "apache-pulsar-2.6.0" folder that goes to the path bin/pulsar
----
2020-08-05 00:35:13 UTC - Brandon Whitehead: why would the tutorial say to put in that command?
----
2020-08-05 00:37:03 UTC - Addison Higham: Try `./bin/pulsar standalone`
----
2020-08-05 00:37:34 UTC - Brandon Whitehead: does the exact same thing
----
2020-08-05 00:38:17 UTC - Brandon Whitehead: also I cant use the pulsar command by itself
----
2020-08-05 00:38:22 UTC - Brandon Whitehead: not even the pulsar_help
----
2020-08-05 00:39:24 UTC - Brandon Whitehead: did I download the wrong apache-pulsar link?
----
2020-08-05 00:39:37 UTC - Brandon Whitehead: <https://pulsar.apache.org/en/download/>
----
2020-08-05 00:39:58 UTC - Brandon Whitehead: I downloaded the "<https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=pulsar/pulsar-2.6.0/apache-pulsar-2.6.0-bin.tar.gz|apache-pulsar-2.6.0-bin.tar.gz>"
----
2020-08-05 00:40:07 UTC - Brandon Whitehead: then did the tar command in the tutorial
----
2020-08-05 00:40:22 UTC - Brandon Whitehead: then moved it to somewhere that wasn't my download folder
----
2020-08-05 00:40:58 UTC - Brandon Whitehead: still can't figure out what is actually wrong
----
2020-08-05 00:41:03 UTC - Addison Higham: what os are you using?
----
2020-08-05 00:41:07 UTC - Brandon Whitehead: windows
----
2020-08-05 00:41:11 UTC - Brandon Whitehead: 10 home
----
2020-08-05 00:43:24 UTC - Addison Higham: are you using powershell or windows subsystem for linux? that command is written in bash, so you need a bash shell
----
2020-08-05 00:43:37 UTC - Brandon Whitehead: using powershell right now
----
2020-08-05 00:43:57 UTC - Brandon Whitehead: how do I use bash in windows?
----
2020-08-05 00:44:04 UTC - Brandon Whitehead: do I need to download something?
----
2020-08-05 00:46:18 UTC - Addison Higham: as is said here in the docs: <https://pulsar.apache.org/docs/en/standalone/#system-requirements> all of that pretty much assumes macos or linux, you can probably make WSL work, but I would suggest doing some reading on how WSL works. Another option would be virtualbox and a linux VM and to try out pulsar there
----
2020-08-05 00:47:05 UTC - Brandon Whitehead: well I'm downloading ubuntu on my machine right now
----
2020-08-05 00:47:11 UTC - Brandon Whitehead: so I can use bash
----
2020-08-05 00:47:22 UTC - Brandon Whitehead: will that work?
----
2020-08-05 00:48:30 UTC - Addison Higham: via WSL? or a VM? in either mode, you will still need to install a java distribution
----
2020-08-05 00:51:35 UTC - Brandon Whitehead: what is WSL?
----
2020-08-05 00:51:43 UTC - Brandon Whitehead: I dont think it's through a virtual machine
----
2020-08-05 00:52:00 UTC - Brandon Whitehead: also bash is open, but I'm not sure what to do now
----
2020-08-05 00:55:09 UTC - Brandon Whitehead: okay so I'm checking the file directory in bash, and I have no paths to anything
----
2020-08-05 00:55:45 UTC - Brandon Whitehead: I use "pwd" and just get "/home/typhon"
----
2020-08-05 00:55:53 UTC - Brandon Whitehead: and I use "ls" and get nothing
----
2020-08-05 00:59:38 UTC - Brandon Whitehead: I'm surprised only one person is here
----
2020-08-05 00:59:52 UTC - Brandon Whitehead: to help me out :c
----
2020-08-05 01:01:50 UTC - Brandon Whitehead: so I downloaded ubuntu to use bash
----
2020-08-05 01:02:00 UTC - Brandon Whitehead: because apparently that command is a bash command only
----
2020-08-05 01:02:38 UTC - Brandon Whitehead: but now using bash isn't getting me any file directories to use, as if I'm not using bash within my own machine
----
2020-08-05 01:02:48 UTC - Brandon Whitehead: is it a VM that I unintentionally made?
----
2020-08-05 01:02:53 UTC - Brandon Whitehead: I cant tell
----
2020-08-05 01:08:03 UTC - Brandon Whitehead: okay, I downloaded the oracle java from the system requirements that were on the tutorial
----
2020-08-05 01:08:05 UTC - Brandon Whitehead: <https://www.oracle.com/java/technologies/javase-jdk14-downloads.html>
----
2020-08-05 01:08:21 UTC - Brandon Whitehead: specifically the windows x64 installer
----
2020-08-05 01:08:33 UTC - Brandon Whitehead: so what should change?
----
2020-08-05 01:08:38 UTC - Brandon Whitehead: can I use bash now?
----
2020-08-05 01:09:40 UTC - Ali Ahmed: @Brandon Whitehead what are you trying to do brandon ?
----
2020-08-05 01:10:02 UTC - Brandon Whitehead: I just want to do this tutorial
----
2020-08-05 01:10:03 UTC - Brandon Whitehead: <https://pulsar.apache.org/docs/en/standalone/>
----
2020-08-05 01:10:22 UTC - Ali Ahmed: are you using linux or mac ?
----
2020-08-05 01:10:26 UTC - Brandon Whitehead: windows
----
2020-08-05 01:11:05 UTC - Ali Ahmed: for windows I would recommend using docker since the artifacts are developed with unix in mind, that’s the simplest thing to do
----
2020-08-05 01:11:14 UTC - Brandon Whitehead: I've been trying to use docker
----
2020-08-05 01:11:19 UTC - Brandon Whitehead: it doesn't work
----
2020-08-05 01:11:41 UTC - Brandon Whitehead: specifically, I use the command in that tutorial here
----
2020-08-05 01:11:42 UTC - Brandon Whitehead: <https://pulsar.apache.org/docs/en/standalone-docker/>
----
2020-08-05 01:11:47 UTC - Brandon Whitehead: the starting one
----
2020-08-05 01:12:06 UTC - Brandon Whitehead: and when I use the docker desktop
----
2020-08-05 01:12:22 UTC - Brandon Whitehead: and try to open via browser, it returns nothing
----
2020-08-05 01:12:23 UTC - Ali Ahmed: ```docker run -it -p 6650:6650 -p 8080:8080 apachepulsar/pulsar:2.6.0 bin/pulsar standalone```
----
2020-08-05 01:12:28 UTC - Ali Ahmed: try this
----
2020-08-05 01:12:37 UTC - Brandon Whitehead: that's exactly what I've been doing
----
2020-08-05 01:12:49 UTC - Brandon Whitehead: I've tried it multiple times
----
2020-08-05 01:13:03 UTC - Ali Ahmed: what do you see here
```<http://localhost:8080/>```
----
2020-08-05 01:14:13 UTC - Brandon Whitehead: http error 404
----
2020-08-05 01:14:38 UTC - Brandon Whitehead: Problem accessing /. Reason:
```    Not Found```

----
2020-08-05 01:14:56 UTC - Brandon Whitehead: then a "Powered by Jetty" thing at the bottom
----
2020-08-05 01:15:00 UTC - Brandon Whitehead: nothing else
----
2020-08-05 01:15:21 UTC - Ali Ahmed: it’s working try this
```<http://localhost:8080/metrics/>```
----
2020-08-05 01:16:10 UTC - Brandon Whitehead: okay I see a bunch of lines
----
2020-08-05 01:16:25 UTC - Brandon Whitehead: seem like stuff that comes through the docker desktop, or very close
----
2020-08-05 01:16:54 UTC - Brandon Whitehead: lot of zookeeper stuff
----
2020-08-05 01:17:04 UTC - Brandon Whitehead: and caffeine cache
----
2020-08-05 01:17:10 UTC - Ali Ahmed: pulsar is live, you can either use the admin cli to connect or send messages via the binary api
----
2020-08-05 01:17:41 UTC - Brandon Whitehead: so it's working?
----
2020-08-05 01:17:44 UTC - Ali Ahmed: yes
----
2020-08-05 01:18:08 UTC - Brandon Whitehead: okay so I'm trying to do the pulsar go tutorial here
----
2020-08-05 01:18:09 UTC - Brandon Whitehead: <https://pulsar.apache.org/docs/en/client-libraries-go/>
----
2020-08-05 01:18:30 UTC - Brandon Whitehead: when I run this program while the docker desktop is still running, it won't connect for some reason
----
2020-08-05 01:19:14 UTC - Brandon Whitehead: ```package main;

import (
    "log"
    "fmt"
    "time"
    "context"

    "github.com/apache/pulsar-client-go/pulsar"
)

func main() {
    client, err := pulsar.NewClient(pulsar.ClientOptions{
        URL:               "pulsar://localhost:8080",
        OperationTimeout:  60 * time.Second,
        ConnectionTimeout: 60 * time.Second,
    })
    if err != nil {
        log.Fatalf("Could not instantiate Pulsar client: %v", err)
    } else {
      log.Printf("Success!")
    }

    defer client.Close()

    producer, err1 := client.CreateProducer(pulsar.ProducerOptions{
      Topic: "my-topic",
      Name: "test-producer",
    })

    if err1 != nil {
        log.Fatalf("Could not connect to client: %v", err1)
    } else {
      log.Printf("Success!")
    }

    _, err2 := producer.Send(context.Background(), &pulsar.ProducerMessage{
      Payload: []byte("hello"),
    })

    defer producer.Close()

    if err2 != nil {
      fmt.Println("Failed to publish message", err2)
    } else {
      log.Printf("Success!")
    }

    fmt.Println("Published message")

    consumer, err := client.Subscribe(pulsar.ConsumerOptions{
      Topic:            "topic-1",
      SubscriptionName: "my-sub",
      Type:             pulsar.Shared,
    })
    if err != nil {
      log.Fatal(err)
    }
    defer consumer.Close()

    for i := 0; i < 10; i++ {
      msg, err := consumer.Receive(context.Background())
      if err != nil {
          log.Fatal(err)
        } else {
          log.Printf("Success!")
        }

        fmt.Printf("Received message msgId: %#v -- content: '%s'\n",
          msg.ID(), string(msg.Payload()))

        consumer.Ack(msg)
      }

      if err := consumer.Unsubscribe(); err != nil {
        log.Fatal(err)
      }
}```
----
2020-08-05 01:19:41 UTC - Ali Ahmed: you can’t connect to the admin interface
``` "pulsar://localhost:8080"```
----
2020-08-05 01:19:44 UTC - Brandon Whitehead: it says the client is being made, but when it gets to creating the producer, it repeatedly tries to connect to broker
----
2020-08-05 01:19:59 UTC - Ali Ahmed: you have to connect to the binary port
----
2020-08-05 01:21:07 UTC - Brandon Whitehead: so the 6650?
----
2020-08-05 01:21:10 UTC - Ali Ahmed: yes
----
2020-08-05 01:21:26 UTC - Ali Ahmed: if you used the same docker command I did
----
2020-08-05 01:21:34 UTC - Brandon Whitehead: okay it seems to be working
----
2020-08-05 01:21:50 UTC - Ali Ahmed: ok
----
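The port mix-up above (8080 is the HTTP admin interface, 6650 the binary port in the docker command shown earlier) can be caught before the client ever retries a connection. A minimal sketch, assuming that port mapping: `checkServiceURL` is a hypothetical helper written for illustration, not part of pulsar-client-go, and it uses only the Go standard library.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// checkServiceURL is a hypothetical pre-flight check (not a pulsar-client-go API).
// It rejects URLs that do not use a Pulsar binary-protocol scheme and flags
// port 8080, which the docker command above maps to the HTTP admin interface.
func checkServiceURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("cannot parse %q: %w", raw, err)
	}
	if u.Scheme != "pulsar" && u.Scheme != "pulsar+ssl" {
		return fmt.Errorf("scheme %q is not a Pulsar binary-protocol scheme", u.Scheme)
	}
	if u.Port() == "8080" {
		return errors.New("port 8080 is the HTTP admin port in this setup; the binary port is 6650")
	}
	return nil
}

func main() {
	for _, raw := range []string{"pulsar://localhost:8080", "pulsar://localhost:6650"} {
		if err := checkServiceURL(raw); err != nil {
			fmt.Printf("%s: %v\n", raw, err)
			continue
		}
		fmt.Printf("%s: looks like a binary-protocol URL\n", raw)
	}
}
```

The check is deliberately narrow: it encodes only the two ports from this conversation, so a deployment with different port mappings would need different rules.
----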
2020-08-05 01:22:29 UTC - Brandon Whitehead: so how do I check the actual message being sent?
----
2020-08-05 01:22:36 UTC - Brandon Whitehead: do I need a consumer too?
----
2020-08-05 01:22:56 UTC - Ali Ahmed: that’s the simplest way
----
2020-08-05 01:23:08 UTC - Ali Ahmed: or you can check the topic stats to see number of messages in it
----
2020-08-05 01:23:30 UTC - Brandon Whitehead: how do you check the topic stats?
----
2020-08-05 01:23:41 UTC - Brandon Whitehead: is it an extension of the 6650?
----
2020-08-05 01:25:04 UTC - Brandon Whitehead: this is my last two lines of output from the go program
----
2020-08-05 01:25:09 UTC - Brandon Whitehead: ```time="2020-08-04T21:21:17-04:00" level=info msg="Connected consumer" consumerID=1 name=kstrh subscription=my-sub topic="persistent://public/default/topic-1"
time="2020-08-04T21:21:17-04:00" level=info msg="Created consumer" consumerID=1 name=kstrh subscription=my-sub topic="persistent://public/default/topic-1"```
----
2020-08-05 01:25:27 UTC - Ali Ahmed: the simplest way is to use pulsar-admin
```pulsar-admin topics stats {Full_TOPIC_NAME}```
----
2020-08-05 01:25:57 UTC - Brandon Whitehead: do you do that command in bash?
----
2020-08-05 01:26:02 UTC - Ali Ahmed: the log lines just say that the topic exists and a consumer has been created
----
2020-08-05 01:26:19 UTC - Ali Ahmed: yes it’s unix script
----
2020-08-05 01:26:24 UTC - Brandon Whitehead: okay
bug : Ermir Zaimi
----
2020-08-05 01:26:31 UTC - Brandon Whitehead: bash for me isn't working right
----
2020-08-05 01:26:43 UTC - Brandon Whitehead: firstly I can't do pulsar commands in powershell
----
2020-08-05 01:26:58 UTC - Brandon Whitehead: but I have absolutely no directory when I use bash
----
2020-08-05 01:27:03 UTC - Ali Ahmed: try this instead <https://github.com/streamnative/pulsarctl>
----
2020-08-05 01:27:13 UTC - Ali Ahmed: it should work on windows
----
2020-08-05 01:27:14 UTC - Brandon Whitehead: as in when I do the "ls" command, it returns nothing
----
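For the topic stats discussed above, the same information is also exposed over the HTTP admin port, which avoids needing bash at all. A minimal sketch, assuming the standalone container from the docker command above is listening on `localhost:8080` and the topic is `persistent://public/default/topic-1` as in the log output; `statsURL` is a hypothetical helper name, and only the Go standard library is used.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"time"
)

// statsURL builds the admin REST URL for a persistent topic's stats.
// adminBase is the HTTP admin address, not the binary port 6650.
func statsURL(adminBase, tenant, namespace, topic string) string {
	return fmt.Sprintf("%s/admin/v2/persistent/%s/%s/%s/stats",
		adminBase, url.PathEscape(tenant), url.PathEscape(namespace), url.PathEscape(topic))
}

func main() {
	u := statsURL("http://localhost:8080", "public", "default", "topic-1")

	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(u)
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	defer resp.Body.Close()

	fmt.Println("HTTP status:", resp.Status)
	// Print the raw JSON stats document returned by the broker.
	if _, err := io.Copy(os.Stdout, resp.Body); err != nil {
		fmt.Fprintln(os.Stderr, "reading response failed:", err)
	}
}
```

Opening the same URL in a browser should show the same JSON, much like the `/metrics/` page used earlier to confirm the broker was up.
----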
2020-08-05 02:54:18 UTC - Rattanjot Singh: I am trying <http://pulsar.apache.org/docs/en/kubernetes-helm/> but while deploying I'm getting ImagePullBackOff.
----
2020-08-05 07:37:26 UTC - Daniel Ciocirlan: hey, short question in case anyone has tried this: if we want to have some edges for our application with minimal infrastructure, is it a concern if we only have consumers connected to the main regions where we actually have Pulsar clusters? These regions only consume, they do not produce anything.
----
2020-08-05 08:36:37 UTC - Vil: pulsar functions question: In the `received()` method of `PulsarSource`, why is a RuntimeException thrown on failure, if configured mode is effectively once?
<https://github.com/apache/pulsar/blob/835f506161bf2a29dee2c7944f14fdc1c2dbb554/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/source/PulsarSource.java#L135>

In org.apache.pulsar.client.api.MessageListener#received() I could find the text “Application is responsible of handling any exception that could be thrown while processing the message”. But now I am confused about what I must do when using effectively once with Pulsar Functions. I expected it would just work after I set the configuration? Or will my function really crash on failure?

(sorry if this is wrong channel)
----