Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/12/20 17:36:19 UTC
[GitHub] [pulsar] candlerb opened a new issue #5908: [doc] Document which
exceptions can be returned by API calls
candlerb opened a new issue #5908: [doc] Document which exceptions can be returned by API calls
URL: https://github.com/apache/pulsar/issues/5908
**Is your feature request related to a problem? Please describe.**
Looking at the Python API: it is currently undocumented which calls can raise exceptions, and under what conditions. This means that the user has to experiment to find out the behaviour.
For example, take the following producer code:
```
import pulsar
import time

client = pulsar.Client('pulsar://localhost:6650')
# send_timeout_millis=0 disables the send timeout
producer = client.create_producer('my-topic', producer_name='fred', send_timeout_millis=0)

for i in range(10):
    print("Sending %d" % i)
    producer.send(('Hello-%d' % i).encode('utf-8'))
    print("Sent %d" % i)
    time.sleep(3)

client.close()
```
1. What happens if the broker is down, at the time of connection?
2. What happens if the broker is down, at the time of message publication?
Answers by experimentation:
1. `pulsar.Client('pulsar://localhost:6650')` raises `Pulsar error: ConnectError` if the broker is not accessible at that time. However:
2. `producer.send` does NOT raise an exception if the broker is down. Rather, in the background, reconnection attempts take place. Debug logs show this:
```
2019-12-20 16:44:33.016 INFO HandlerBase:129 | [persistent://public/default/my-topic, fred] Schedule reconnection in 0.1 s
2019-12-20 16:44:33.092 INFO ClientConnection:1337 | [127.0.0.1:37290 -> 127.0.0.1:6650] Connection closed
2019-12-20 16:44:33.093 INFO ClientConnection:229 | [127.0.0.1:37290 -> 127.0.0.1:6650] Destroyed connection
2019-12-20 16:44:33.093 INFO ClientConnection:1337 | [127.0.0.1:37300 -> 127.0.0.1:6650] Connection closed
2019-12-20 16:44:33.094 INFO ClientConnection:229 | [127.0.0.1:37300 -> 127.0.0.1:6650] Destroyed connection
2019-12-20 16:44:33.116 INFO HandlerBase:52 | [persistent://public/default/my-topic, fred] Getting connection from pool
2019-12-20 16:44:33.117 INFO ConnectionPool:62 | Deleting stale connection from pool for pulsar://localhost:6650 use_count: -1 @ 0
2019-12-20 16:44:33.117 INFO ConnectionPool:72 | Created connection for pulsar://localhost:6650
2019-12-20 16:44:33.124 ERROR ClientConnection:374 | [<none> -> pulsar://localhost:6650] Failed to establish connection: Connection refused
2019-12-20 16:44:33.124 INFO ClientConnection:1337 | [<none> -> pulsar://localhost:6650] Connection closed
... at increasing intervals
```
Since this is `send` rather than `send_async`, it blocks until the broker comes back up.
Then: suppose I set `send_timeout_millis=10000`. What happens if the message can't be sent within that time? (Answer by experiment: `producer.send` raises `Pulsar error: TimeOut`.) What about with `send_async`? (Answer: the callback function is invoked with `_pulsar.Result.Timeout`.)
Are there any other situations in which `send` or `send_async` can raise an exception? I don't know. Therefore I don't know what I might have to catch.
**Describe the solution you'd like**
Each API method should have its semantics documented, including the exceptions it may raise.
In the specific example above: the documentation for [pulsar.Client.create_producer](https://pulsar.apache.org/api/python/#pulsar.Client.create_producer) should indicate which exception is raised if the broker is not reachable, and the documentation for [pulsar.Producer.send](https://pulsar.apache.org/api/python/#pulsar.Producer.send) should state that it *won't* raise an exception if the broker is down, but can raise a timeout exception if the message could not be delivered within the send timeout.
Knowing the API contract is crucial to writing robust applications. Documenting the behaviour is also a safety net against this behaviour being changed unexpectedly.
**Describe alternatives you've considered**
Trial and error.
**Additional context**
For comparison, the confluent_kafka [documentation](https://docs.confluent.io/current/clients/confluent-kafka-python/index.html#confluent_kafka.Producer.produce) includes a "Raises:" section under each call which can return an exception, stating which exceptions might occur.
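To make the request concrete, a "Raises:" section for a blocking send call might look like the docstring below. This is a hypothetical stub, not the actual Pulsar documentation: the function body does nothing, and the behaviour it describes is only what was observed by experiment in this issue.

```python
def send(content):
    """Publish a message and block until the broker acknowledges it.

    Hypothetical docstring: it illustrates the requested layout only.

    Args:
        content (bytes): The message payload.

    Raises:
        Exception: With message "Pulsar error: TimeOut" if the message could
            not be delivered within send_timeout_millis (observed by
            experiment, not confirmed by the maintainers).

    Note:
        Does not raise if the broker is down; the client reconnects in the
        background and the call blocks until delivery or timeout.
    """
    raise NotImplementedError("documentation stub only")


if __name__ == "__main__":
    # Show the docstring the way help() or a doc generator would see it.
    print(send.__doc__)
```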
Incidentally, the connect behaviour described above is different from Kafka's. With the confluent_kafka library, the Producer can be created even when the broker is down, and the program will attempt to connect in the background.
```
from confluent_kafka import Producer
import time

producer = Producer({"bootstrap.servers": "localhost"}) # NO EXCEPTION if broker is down

def delivery_report(err, msg):
    print("%r %r" % (err, msg))

for i in range(10):
    producer.poll(0)  # serve delivery callbacks for earlier messages
    print("Sending %d" % i)
    producer.produce('my-topic', ("Hello-%d" % i).encode('utf-8'), callback=delivery_report)
    print("Sent %d" % i)
    time.sleep(3)

producer.flush()
```
Arguably this is more consistent than Pulsar: the Kafka API always handles connections for you, whereas the Pulsar API (re)connects in the background in some situations and fails outright in others. This means it's up to the user to implement backoff and reconnection strategies.
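A user-side backoff strategy for the initial connection could be sketched as follows. The function `connect_with_backoff` and its parameters are hypothetical; `connect` is any zero-argument callable that either returns a connected object or raises, and the 0.1 s starting delay simply mirrors the first reconnection interval visible in the debug log above.

```python
import time


def connect_with_backoff(connect, max_attempts=5, base_delay=0.1,
                         max_delay=5.0, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the delay after each failure.

    Re-raises the last exception once max_attempts calls have failed.
    `sleep` is injectable so the function can be exercised without waiting.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except Exception:  # broad on purpose: the raised types are undocumented
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)


if __name__ == "__main__":
    attempts = []

    def flaky_connect():
        # Stand-in for client/producer creation: fails twice, then succeeds.
        attempts.append(1)
        if len(attempts) < 3:
            raise Exception("Pulsar error: ConnectError")
        return "connected"

    print(connect_with_backoff(flaky_connect))
```

In real use, `connect` would be a small function that creates the client and producer as in the first example and closes any half-created client before re-raising.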
However, the exact behaviour is less important than documenting what it is, since at least then the user knows what is expected of them.
The Kafka documentation isn't perfect either: it doesn't say explicitly whether the callback is invoked in the same thread as the caller (answer: it is, unlike Pulsar). But it does say that the callback will be called from [Producer.poll](https://docs.confluent.io/current/clients/confluent-kafka-python/index.html#confluent_kafka.Producer.poll), which implies it happens in the same thread of execution.
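Because the Pulsar callback runs outside the calling thread, results generally need a thread-safe hand-off before the main thread can act on them. The sketch below assumes the callback is invoked with a result code followed by one further argument (the message or its id); the `DeliveryLog` class is a hypothetical helper, not part of either client library.

```python
import queue
import threading


class DeliveryLog:
    """Collects delivery results reported from another thread."""

    def __init__(self):
        self._results = queue.Queue()

    def callback(self, result, msg):
        # May run on a client-owned thread: only enqueue, never block here.
        self._results.put((result, msg))

    def drain(self):
        """Return every result reported so far, without blocking."""
        items = []
        while True:
            try:
                items.append(self._results.get_nowait())
            except queue.Empty:
                return items


if __name__ == "__main__":
    log = DeliveryLog()
    # Simulate a background thread reporting one result.
    worker = threading.Thread(target=log.callback, args=("Timeout", None))
    worker.start()
    worker.join()
    print(log.drain())
```

With a real producer the hand-off would be wired up as `producer.send_async(payload, log.callback)`, and the main loop would call `log.drain()` periodically to inspect results such as `_pulsar.Result.Timeout`.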
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [pulsar] jiazhai commented on issue #5908: [doc] Document which
exceptions can be returned by API calls
Posted by GitBox <gi...@apache.org>.
jiazhai commented on issue #5908: [doc] Document which exceptions can be returned by API calls
URL: https://github.com/apache/pulsar/issues/5908#issuecomment-568338088
Thanks @candlerb for this issue.