You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2018/08/22 09:11:04 UTC

Slack digest for #general - 2018-08-22

2018-08-21 09:19:09 UTC - Bharadwaj R Embar: @Bharadwaj R Embar has joined the channel
----
2018-08-21 13:33:41 UTC - George Wilk: @George Wilk has joined the channel
----
2018-08-21 13:36:22 UTC - George Wilk: We are creating new unique subscription names in course of CI pipeline testing.  Is there any downside to creating lots of subscriptions?  If so, is there an API that would allow us to clean them up once they're no longer needed?
----
2018-08-21 13:44:45 UTC - Mate Varga: We are having the same issue. Actually it'd be 'cleaner' to have a (fast) full state reset API.
----
2018-08-21 15:52:03 UTC - Sanjeev Kulkarni: @George Wilk are these subscriptions for a few topics? Or the number of topics are similar to no of subscriptions?
----
2018-08-21 15:58:52 UTC - George Wilk: @Sanjeev Kulkarni we have about ~20 topics being published to from a Java client, and subscribers are python clients executing during CI pipeline func testing.  Python pulsar-client 2.1.0.  We've seen issues with client connections when attempting to re-use the subscriptions in test, this despite closing subscriptions and client connections on teardown. Since introducing timestamp-based subscription names each time the test runs we've not seen any connectivity issues.  This however creates new set of subscriptions to each topic every time we run the CI pipeline.
----
2018-08-21 16:02:51 UTC - Sanjeev Kulkarni: What issues have you seen wrt python clients?  IIUC when you close one python client and try to instantiate another client wit the same subscription, it refuses to connect? Is there a error message?
----
2018-08-21 16:06:54 UTC - George Wilk: We've seen this behavior when closing python client on teardown of one test, and then on subsequent runs opening the python client and trying to create the same subscription.  Just to clarify - not during same test execution.  Here is the error message:

system:9 Bad file descriptor

This happens when connection goes down between the time message is received and ACK is sent, so it is related to closed socket connection while sending ACK.

To be sure, we are using shared subscriptions i.e. consumer_type=ConsumerType.Shared.
----
2018-08-21 16:13:00 UTC - George Wilk: Just realized that I did not answer your question:  subscriptions are opened 1 per topic.  We were unable to use wildcard to subscribe to multiple topics using single subscription.
----
2018-08-21 16:35:01 UTC - George Wilk: @Sanjeev Kulkarni Another error we've been seeing in the same scenario is:

ERROR ClientConnection:493 | [111.68.22.150:49971 -&gt; 100.65.86.171:6651] [consumer id 17, message ledger id 2204, entry id 256] Error parsing message metadata

[libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message of type "pulsar.proto.MessageMetadata" because it is missing required fields: (cannot determine missing fields for lite message)
----
2018-08-21 17:18:34 UTC - Matteo Merli: @George Wilk The behavior in C++ (and Python which is based on it) is to give error when acknowledging a message and the connection is not available. 

We will improve this behavior (like already did for Java). For now, you can “ignore” the exceptions on acknowledge, since the messages will be resent in any case (if the ack failed).
----
2018-08-21 17:24:52 UTC - George Wilk: @Matteo Merli Thanks for your reply.  We've been handling these exceptions by ignoring them on ACK, but that still leaves the other two errors we've been seeing on receive():

ERROR ClientConnection:493 | [111.68.22.150:49971 -&gt; 100.65.86.171:6651] [consumer id 17, message ledger id 2204, entry id 256] Error parsing message metadata

[libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message of type "pulsar.proto.MessageMetadata" because it is missing required fields: (cannot determine missing fields for lite message)
----
2018-08-21 17:26:21 UTC - Matteo Merli: @George Wilk There are multiple ways to delete existing subscriptions. eg: 

 * `bin/pulsar-admin topics unsubscribe $MY_TOPIC -s $MY_SUBSCRIPTION` 
 * Or just delete the topic (and use `--force ` if it’s being used) : `bin/pulsar-admin topics delete --force  $MY_TOPIC`
----
2018-08-21 17:27:16 UTC - Matteo Merli: Can you check the version of the Python client library?
----
2018-08-21 17:27:35 UTC - George Wilk: :+1: great!
----
2018-08-21 17:28:09 UTC - Matteo Merli: These are available also from REST / Java admin API
----
2018-08-21 17:31:55 UTC - Matteo Merli: Is that different from Java producers version? (not that it should be failing in any case).
----
2018-08-21 17:32:17 UTC - Matteo Merli: Would you be able to provide a tcpdump taken from consumer?
----
2018-08-21 18:06:01 UTC - George Wilk: python client lib is 2.1.0 and Java client is 2.0.1-incubating.  It is an intermittent failure though.. so I would not suspect version incompatibility as culprit in this case, but point well taken - we should upgrade.
----
2018-08-21 18:07:12 UTC - Matteo Merli: ok, I was just trying to seek possible causes
----
2018-08-21 18:07:30 UTC - Matteo Merli: any way to reliably reproduce this issue?
----
2018-08-21 18:07:59 UTC - Matteo Merli: does it happen after the connection lost?
----
2018-08-21 23:56:07 UTC - Ravi: 2018-08-21 23:55:23.077 INFO  ConnectionPool:63 | Created connection for <pulsar://192.168.1.33:6050>
2018-08-21 23:55:23.078 ERROR ClientConnection:329 | [&lt;none&gt; -&gt; <pulsar://192.168.1.33:6050>] Failed to establish connection: Connection refused
2018-08-21 23:55:23.078 INFO  ClientConnection:1182 | [&lt;none&gt; -&gt; <pulsar://192.168.1.33:6050>] Connection closed
2018-08-21 23:55:23.078 ERROR ClientImpl:267 | Error Checking/Getting Partition Metadata while Subscribing- 5
2018-08-21 23:55:23.078 INFO  ClientConnection:195 | [&lt;none&gt; -&gt; <pulsar://192.168.1.33:6050>] Destroyed connection
Traceback (most recent call last):
  File "consumer.py", line 26, in &lt;module&gt;
    main(sys.argv)
  File "consumer.py", line 13, in main
    consumer = client.subscribe(WORD_COUNT_TOPIC, SUBSCRIPTION)
  File "/usr/local/lib/python2.7/site-packages/pulsar/__init__.py", line 470, in subscribe
    c._consumer = self._client.subscribe(topic, subscription_name, conf)
Exception: Pulsar error: ConnectError
$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS                                            NAMES
77a650c08729        streamlio/sandbox   "supervisord -n"    6 minutes ago       Up 6 minutes        0.0.0.0:6050-&gt;6050/tcp, 0.0.0.0:8010-&gt;8010/tcp   docker_Streamlio_1
----
2018-08-21 23:56:17 UTC - Ravi: @Matteo Merli can you tell me what the issue is?
----
2018-08-22 00:01:04 UTC - Matteo Merli: @Ravi The client seems not be able to connect to `192.168.1.33:6050`
----
2018-08-22 00:01:30 UTC - Ravi: it is in the same host
----
2018-08-22 00:01:40 UTC - Ravi: thats the host ip
----
2018-08-22 00:01:57 UTC - Ravi: $ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS                                            NAMES
77a650c08729        streamlio/sandbox   "supervisord -n"    6 minutes ago       Up 6 minutes        0.0.0.0:6050-&gt;6050/tcp, 0.0.0.0:8010-&gt;8010/tcp   docker_Streamlio_1
$ hostname -i
192.168.1.33
----
2018-08-22 00:02:11 UTC - Matteo Merli: what if you pass `127.0.0.1:6650` as service url?
----
2018-08-22 00:02:27 UTC - Ravi: you mean 127.0.0.1:6050
----
2018-08-22 00:02:32 UTC - Ravi: i changed the ports
----
2018-08-22 00:02:40 UTC - Matteo Merli: sure
----
2018-08-22 00:03:37 UTC - Ravi: same
----
2018-08-22 00:03:38 UTC - Ravi: $ python consumer.py
2018-08-22 00:03:16.004 INFO  ConnectionPool:63 | Created connection for <pulsar://127.0.0.1:6050>
2018-08-22 00:03:16.005 INFO  ClientConnection:285 | [127.0.0.1:60228 -&gt; 127.0.0.1:6050] Connected to broker
2018-08-22 00:03:16.005 INFO  ClientConnection:1182 | [127.0.0.1:60228 -&gt; 127.0.0.1:6050] Connection closed
2018-08-22 00:03:16.005 ERROR ClientImpl:267 | Error Checking/Getting Partition Metadata while Subscribing- 5
2018-08-22 00:03:16.005 INFO  ClientConnection:195 | [127.0.0.1:60228 -&gt; 127.0.0.1:6050] Destroyed connection
Traceback (most recent call last):
----
2018-08-22 00:04:22 UTC - Matteo Merli: is the docker container running on a laptop?
----
2018-08-22 00:04:34 UTC - Matteo Merli: can you try to restart the container?
----
2018-08-22 00:05:16 UTC - Ravi: no its in aws
----
2018-08-22 00:05:22 UTC - Ravi: i used a compose file to build
----
2018-08-22 00:05:41 UTC - Ravi: version: '3'
services:
 Streamlio:
    image: "streamlio/sandbox"
    volumes:
      - /appvol/streamlio/data:/pulsar/data
    ports:
      - "6050:6050"
      - "8010:8010"
    networks:
      - streamsnet
networks:
  streamsnet:
    driver: bridge
----
2018-08-22 00:07:09 UTC - Matteo Merli: ok, can you try `docker-compose restart Streamlio`, just in case?
----
2018-08-22 00:07:09 UTC - Ravi: restarted
----
2018-08-22 00:07:38 UTC - Ravi: no use
----
2018-08-22 00:08:17 UTC - Ravi: $ netstat -tupln|grep 6050
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp6       0      0 :::6050                 :::*                    LISTEN      -
$ netstat -tupln|grep 8010
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp6       0      0 :::8010                 :::*                    LISTEN      -
----
2018-08-22 00:08:40 UTC - Matteo Merli: so process is running inside container, right?
----
2018-08-22 00:09:17 UTC - Ravi: ?
----
2018-08-22 00:09:35 UTC - Ravi: i m running python consumer.py
----
2018-08-22 00:09:41 UTC - Ravi: it worked in local
----
2018-08-22 00:09:44 UTC - Ravi: not in aws
----
2018-08-22 00:10:01 UTC - Matteo Merli: I mean, the Pulsar broker that runs inside the Docker container
----
2018-08-22 00:10:25 UTC - Ravi: it should , i just used your image
----
2018-08-22 00:10:39 UTC - Ravi: i sent u the compose file as well
----
2018-08-22 00:10:52 UTC - Matteo Merli: can you go inside: 

`docker exec -it $CONTAINER_ID bash` then `ps aux | grep java`
----
2018-08-22 00:10:55 UTC - Ravi: you can run it removing the volume
----
2018-08-22 00:12:02 UTC - Matteo Merli: Could it be something related to the `networks` section in the compose file?
----
2018-08-22 00:12:46 UTC - Ravi: root@77a650c08729:/scripts# ps -ef|grep java
root         9     1  1 00:07 ?        00:00:05 /docker-java-home/jre/bin/java -cp /pulsar/conf:::/pulsar/lib/*: -Dlog4j.configurationFile=log4j2.yaml -Djute.maxbuffer=10485760 -Djava.net.preferIPv4Stack=true -Xmx512M -XX:MaxDirectMemorySize=512M -XX:+UseG1GC -Dio.netty.leakDetectionLevel=disabled -Dio.netty.recycler.maxCapacity.default=1000 -Dio.netty.recycler.linkCapacity=1024 -Dpulsar.log.appender=RoutingAppender -Dpulsar.log.dir=/pulsar/logs -Dpulsar.log.level=info -Dpulsar.routing.appender.default=Console -Dpulsar.functions.process.container.log.dir=/pulsar/logs -Dpulsar.functions.java.instance.jar=/pulsar/instances/java-instance.jar -Dpulsar.functions.python.instance.file=/pulsar/instances/python-instance/python_instance_main.py -Dzookeeper.4lw.commands.whitelist=* -Dpulsar.log.file=pulsar-standalone.log org.apache.pulsar.PulsarStandaloneStarter --config /pulsar/conf/standalone.conf --advertised-address 127.0.0.1
----
2018-08-22 00:12:58 UTC - Matteo Merli: ok, looks fine there
----
2018-08-22 00:13:02 UTC - Matteo Merli: Can you try to run without compose?
----
2018-08-22 00:13:15 UTC - Ravi: yep if it was the compose file it will error off right away
----
2018-08-22 00:14:01 UTC - Matteo Merli: Uhm, I think the problem is that the process inside is listening on 6650
----
2018-08-22 00:14:15 UTC - Matteo Merli: but you’re opening 6050
----
2018-08-22 00:14:25 UTC - Ravi: how is it possible
----
2018-08-22 00:14:37 UTC - Ravi: i changed the consumer.py also to listen on 6050
----
2018-08-22 00:14:51 UTC - Ravi: def main(args):
    # Create a pulsar client instance with reference to the broker
    client = pulsar.Client('<pulsar://localhost:6050>')
----
2018-08-22 00:15:14 UTC - Matteo Merli: yes, but the Pulsar Java process inside the container is using port 6650
----
2018-08-22 00:15:27 UTC - Ravi: ok let me try putting it back to 6650
----
2018-08-22 00:18:50 UTC - Ravi: $ python consumer.py
2018-08-22 00:18:37.069 INFO  ConnectionPool:63 | Created connection for <pulsar://localhost:6650>
2018-08-22 00:18:37.070 INFO  ClientConnection:285 | [127.0.0.1:44880 -&gt; 127.0.0.1:6650] Connected to broker
2018-08-22 00:18:37.071 INFO  ClientConnection:1182 | [127.0.0.1:44880 -&gt; 127.0.0.1:6650] Connection closed
2018-08-22 00:18:37.071 ERROR ClientImpl:267 | Error Checking/Getting Partition Metadata while Subscribing- 5
2018-08-22 00:18:37.071 INFO  ClientConnection:195 | [127.0.0.1:44880 -&gt; 127.0.0.1:6650] Destroyed connection
Traceback (most recent call last):
  File "consumer.py", line 26, in &lt;module&gt;
    main(sys.argv)
  File "consumer.py", line 13, in main
    consumer = client.subscribe(WORD_COUNT_TOPIC, SUBSCRIPTION)
----
2018-08-22 00:21:19 UTC - Matteo Merli: Uhm, ok :slightly_smiling_face:
----
2018-08-22 00:21:41 UTC - Matteo Merli: And what if you remove compose completely? 

```
docker run -d \
  --name streamlio-sandbox \
  -p 6650:6650 \
  -p 8080:8080 \
  -p 8000:8000 \
  streamlio/sandbox
```
----
2018-08-22 00:21:50 UTC - Ravi: :disappointed:
----
2018-08-22 00:22:27 UTC - Ravi: going to try
----
2018-08-22 00:22:35 UTC - Matteo Merli: I don’t know much about compose, but I’d suspect the `   driver: bridge` for the network
----
2018-08-22 00:23:52 UTC - Ravi: no use
----
2018-08-22 00:24:03 UTC - Ravi: no if driver was an issue it wont boot up,
----
2018-08-22 00:24:12 UTC - Ravi: i tried the cli run, no use
----
2018-08-22 00:25:05 UTC - Ravi: $ python consumer.py
2018-08-22 00:23:42.852 INFO  ConnectionPool:63 | Created connection for <pulsar://localhost:6650>
2018-08-22 00:23:42.852 INFO  ClientConnection:285 | [127.0.0.1:45218 -&gt; 127.0.0.1:6650] Connected to broker
2018-08-22 00:23:42.853 INFO  ClientConnection:1182 | [127.0.0.1:45218 -&gt; 127.0.0.1:6650] Connection closed
2018-08-22 00:23:42.853 ERROR ClientImpl:267 | Error Checking/Getting Partition Metadata while Subscribing- 5
2018-08-22 00:23:42.853 INFO  ClientConnection:195 | [127.0.0.1:45218 -&gt; 127.0.0.1:6650] Destroyed connection
Traceback (most recent call last):
  File "consumer.py", line 26, in &lt;module&gt;
    main(sys.argv)
----
2018-08-22 00:25:16 UTC - Matteo Merli: so, you said it was working locally but not the same thing on aws?
----
2018-08-22 00:26:03 UTC - Ravi: yes
----
2018-08-22 00:26:10 UTC - Ravi: i need it on aws
----
2018-08-22 00:26:19 UTC - Ravi: do u have a image on aws
----
2018-08-22 00:26:57 UTC - Ravi: do u want to escalate this ?
----
2018-08-22 00:29:11 UTC - Ravi: avis-MacBook-Pro:gcp ravihanzo$ python consumer.py
20:28:42.125 [0x7fffb1c93380] INFO  /Users/vagrant/incubator-pulsar/pulsar-client-cpp/lib/ConnectionPool.cc(63) - Created connection for <pulsar://localhost:6650>
20:28:42.128 [0x700002576000] INFO  /Users/vagrant/incubator-pulsar/pulsar-client-cpp/lib/ClientConnection.cc(279) - [[::1]:52680 -&gt; [::1]:6650] Connected to broker
20:28:42.151 [0x700002576000] INFO  /Users/vagrant/incubator-pulsar/pulsar-client-cpp/lib/HandlerBase.cc(53) - [<persistent://public/default/wordcount>, my-sub, 0] Getting connection from pool
20:28:42.192 [0x700002576000] INFO  /Users/vagrant/incubator-pulsar/pulsar-client-cpp/lib/ConnectionPool.cc(63) - Created connection for <pulsar://3d2ed6d0d4fe:6650>
20:28:42.193 [0x700002576000] INFO  /Users/vagrant/incubator-pulsar/pulsar-client-cpp/lib/ClientConnection.cc(281) - [[::1]:52681 -&gt; [::1]:6650] Connected to broker through proxy. Logical broker: <pulsar://3d2ed6d0d4fe:6650>
20:28:42.270 [0x700002576000] INFO  /Users/vagrant/incubator-pulsar/pulsar-client-cpp/lib/ConsumerImpl.cc(165) - [<persistent://public/default/wordcount>, my-sub, 0] Created consumer on broker [[::1]:52681 -&gt; [::1]:6650]
----
2018-08-22 00:29:15 UTC - Ravi: works fine on my mac
----
2018-08-22 00:30:24 UTC - Matteo Merli: Ok, can you check broker logs inside container?
----
2018-08-22 00:31:07 UTC - Matteo Merli: `docker exec -it $CONTAINER_ID bash` and then 

`cd /var/log/supervisor` 

You would have something like `pulsar-stdout---supervisor-PQhXSV.log`
----
2018-08-22 00:31:14 UTC - Ravi: let me see
----
2018-08-22 00:32:03 UTC - Ravi: no such file
----
2018-08-22 00:32:22 UTC - Matteo Merli: I mean, with similar name
----
2018-08-22 00:32:34 UTC - Matteo Merli: `pulsar-stdout---supervisor-*.log`
----
2018-08-22 00:33:20 UTC - Ravi: there is no file
----
2018-08-22 00:33:41 UTC - Ravi: not on my local, not in the aws
----
2018-08-22 00:33:44 UTC - Ravi: i checked both
----
2018-08-22 00:33:46 UTC - Ravi: root@3d2ed6d0d4fe:/var/log# ls -l
total 200
-rw-r--r-- 1 root root  22864 Jun 19 07:59 alternatives.log
drwxr-xr-x 1 root root   4096 Jun 19 07:59 apt
-rw-rw---- 1 root utmp      0 Apr 26 00:00 btmp
-rw-r--r-- 1 root root 134836 Jun 19 07:59 dpkg.log
-rw-r--r-- 1 root root   3232 Apr 26 00:00 faillog
-rw-r--r-- 1 root root    751 Jun  6 18:44 fontconfig.log
-rw-rw-r-- 1 root utmp  29492 Apr 26 00:00 lastlog
-rw-rw-r-- 1 root utmp      0 Apr 26 00:00 wtmp
----
2018-08-22 00:33:51 UTC - Ravi: this is from my local
----
2018-08-22 00:34:06 UTC - Matteo Merli: `cd /var/log/supervisor`
----
2018-08-22 00:34:13 UTC - Ravi: this is from aws
----
2018-08-22 00:34:15 UTC - Ravi: root@212eee8db557:/var/log# ls -l
total 200
-rw-r--r-- 1 root root  22864 Jun 19 07:59 alternatives.log
drwxr-xr-x 1 root root   4096 Jun 19 07:59 apt
-rw-rw---- 1 root utmp      0 Apr 26 00:00 btmp
-rw-r--r-- 1 root root 134836 Jun 19 07:59 dpkg.log
-rw-r--r-- 1 root root   3232 Apr 26 00:00 faillog
-rw-r--r-- 1 root root    751 Jun  6 18:44 fontconfig.log
-rw-rw-r-- 1 root utmp  29492 Apr 26 00:00 lastlog
-rw-rw-r-- 1 root utmp      0 Apr 26 00:00 wtmp
----
2018-08-22 00:34:24 UTC - Ravi: no supervisor on the list
----
2018-08-22 00:34:33 UTC - Ravi: there is only apt as a directory
----
2018-08-22 00:35:28 UTC - Matteo Merli: there must be :confused: Since there are few processes that are running under supervisor watchdog
----
2018-08-22 00:36:08 UTC - Ravi: not on my local
----
2018-08-22 00:36:12 UTC - Ravi: not on my aws
----
2018-08-22 00:36:15 UTC - Ravi: i checked both
----
2018-08-22 00:36:45 UTC - Matteo Merli: I just tried to run `docker run -it  --name streamlio-sandbox   -p 6650:6650   -p 8080:8080   -p 8000:8000` and it was there on mine
----
2018-08-22 00:36:46 UTC - Ravi: should i start using pulsar instead of <http://streaml.io|streaml.io> ?
----
2018-08-22 00:37:14 UTC - Matteo Merli: it’s mostly the same, there are some additional components in the streamlio image
----
2018-08-22 00:37:15 UTC - Ravi: the only diff i see is volume mapping is different
----
2018-08-22 00:37:24 UTC - Ravi: and ports are different
----
2018-08-22 00:42:01 UTC - Ravi: retrying it
----
2018-08-22 00:42:13 UTC - Ravi: $ docker run -d \
&gt;   --name streamlio-sandbox \
&gt;   -p 6650:6650 \
&gt;   -p 8010:8010 \
&gt;   -p 8001:8001 \
&gt;   streamlio/sandbox
----
2018-08-22 00:43:04 UTC - Ravi: ok seems to be there
----
2018-08-22 00:43:51 UTC - Ravi: ok this worked finally
----
2018-08-22 00:44:02 UTC - Ravi: now need to understand why it was failing before
----
2018-08-22 00:44:29 UTC - Ravi: going to redo this via compose
----
2018-08-22 00:45:54 UTC - Ravi: question - do you know where the volumes are mapped in streamlio sandbox
----
2018-08-22 00:48:37 UTC - Ravi: Does anyone know what is the volume mapping in Streamlio docker image ? pulsar user /pulsar/data
----
2018-08-22 01:00:38 UTC - Matteo Merli: Sorry, had to leave. I'll check it in a bit 
----
2018-08-22 01:39:16 UTC - Matteo Merli: @Ravi by default it will be writing everything under `/pulsar/data`. You can mount that directory from outside volume to avoid loosing the data
----
2018-08-22 03:56:32 UTC - Ali Ahmed: if you need to map volumes you can do as so
```
docker run -it \
  -p 6650:6650 \
  -p 8080:8080 \
  -v $PWD/data:/pulsar/data \
  apachepulsar/pulsar:latest \
  bin/pulsar standalone
```
----
2018-08-22 04:00:12 UTC - Ravi: apt-get install vim
----
2018-08-22 04:00:16 UTC - Ravi: sorry
----
2018-08-22 04:00:38 UTC - Ravi: INFO success: pulsar-ui entered RUNNING state, process has stayed up for &gt; than 1 seconds (startsecs)
Streamlio_1  | 2018-08-22 03:54:08,497 INFO exited: function-submit (exit status 1; not expected)
----
2018-08-22 04:00:43 UTC - Ravi: thats the error i get
----