You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/10/31 13:05:08 UTC
[GitHub] [pulsar] candlerb opened a new issue #5521: Python exception
message not shown when python function fails to start
candlerb opened a new issue #5521: Python exception message not shown when python function fails to start
URL: https://github.com/apache/pulsar/issues/5521
**Describe the bug**
I have a problem starting a python function. However, the actual python error message (reported on stderr) is being lost. I can only see it if I "strace" the entire pulsar stack.
**To Reproduce**
I am using pulsar functions with pulsar 2.4.1 in standalone mode, base OS is Ubuntu 18.04, and I've done the frig to link "python" to "python3":
```
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 10
```
Here's my function:
```
$ cat func1.py
from pulsar import Function
class FirstFunction(Function):
def process(self, item, context):
log = context.get_logger()
log.info("Got %r with properties %r" % (item, context.get_message_properties()))
```
I can run it from the command line and it just returns without error.
Deployed:
```
$ apache-pulsar-2.4.1/bin/pulsar-admin functions create --name womble --inputs my-topic --py /home/ubuntu/func1.py --classname FirstFunction --log-topic pulsar-log
$ apache-pulsar-2.4.1/bin/pulsar-admin functions status --name womble
{
"numInstances" : 1,
"numRunning" : 0,
"instances" : [ {
"instanceId" : 0,
"status" : {
"running" : false,
"error" : "",
"numRestarts" : 222,
"numReceived" : 0,
"numSuccessfullyProcessed" : 0,
"numUserExceptions" : 0,
"latestUserExceptions" : [ ],
"numSystemExceptions" : 0,
"latestSystemExceptions" : [ ],
"averageLatency" : 0.0,
"lastInvocationTime" : 0,
"workerId" : "c-standalone-fw-ldex-pulsar.int.example.net-8080"
}
} ]
}
```
Looking at the pulsar process output, I see it restarting every 30 seconds but no indication of why it's dying. There's a "DeathException" but no further info - not even the process exit code as fas as I can see.
```
12:39:45.003 [function-timer-thread-94-1] ERROR org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/womble-java.lang.RuntimeException: Function
Container is dead with exception.. restarting
12:39:45.003 [function-timer-thread-94-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Creating function log directory /home/ubuntu/apache-pulsar-2
.4.1/logs/functions/public/default/womble
12:39:45.003 [function-timer-thread-94-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Created or found function log directory /home/ubuntu/apache-
pulsar-2.4.1/logs/functions/public/default/womble
12:39:45.003 [function-timer-thread-94-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - ProcessBuilder starting the process with args python /home/u
buntu/apache-pulsar-2.4.1/instances/python-instance/python_instance_main.py --py /tmp/pulsar_functions/public/default/womble/0/func1.py --logging_directory /home
/ubuntu/apache-pulsar-2.4.1/logs/functions --logging_file womble --logging_config_file /home/ubuntu/apache-pulsar-2.4.1/conf/functions-logging/logging_config.ini
--instance_id 0 --function_id 67fb9828-56ab-49ff-bbbc-56db9510d26c --function_version 4f01d696-28da-40b7-b9f5-2283fbe9bd7c --function_details '{"tenant":"public
","namespace":"default","name":"womble","className":"FirstFunction","logTopic":"pulsar-log","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpe
cs":{"my-topic":{}},"cleanupSubscription":true},"sink":{},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}' --pulsar_s
erviceurl pulsar://ldex-pulsar.int.example.net:6650 --max_buffered_tuples 1024 --port 38423 --metrics_port 36591 --state_storage_serviceurl bk://127.0.0.1:418
1 --expected_healthcheck_interval 30 --secrets_provider secretsprovider.ClearTextSecretsProvider --cluster_name standalone
12:39:45.005 [function-timer-thread-94-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Started process successfully
2019-10-31 12:39:45.478 INFO ConnectionPool:72 | Created connection for pulsar://ldex-pulsar.int.example.net:6650
2019-10-31 12:39:45.480 INFO ClientConnection:324 | [10.85.179.103:37476 -> 10.85.179.103:6650] Connected to broker
12:39:45.480 [pulsar-io-50-6] INFO org.apache.pulsar.broker.service.ServerCnx - New connection from /10.85.179.103:37476
2019-10-31 12:39:45.481 INFO BatchMessageContainer:43 | { BatchContainer [size = 0] [batchSizeInBytes_ = 0] [maxAllowedMessageBatchSizeInBytes_ = 131072] [maxAl
lowedNumMessagesInBatch_ = 1000] [topicName = persistent://public/default/pulsar-log] [producerName_ = ] [batchSizeInBytes_ = 0] [numberOfBatchesSent = 0] [avera
geBatchSize = 0]} BatchMessageContainer constructed
2019-10-31 12:39:45.481 INFO HandlerBase:52 | [persistent://public/default/pulsar-log, ] Getting connection from pool
12:39:45.482 [pulsar-io-50-6] INFO org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37476][persistent://public/default/pulsar-log] Creating producer
. producerId=0
12:39:45.483 [ForkJoinPool.commonPool-worker-1] INFO org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37476]-0 persistent://public/default/pulsar-lo
g configured with schema false
12:39:45.483 [ForkJoinPool.commonPool-worker-1] INFO org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37476] Created new producer: Producer{topic=Pe
rsistentTopic{topic=persistent://public/default/pulsar-log}, client=/10.85.179.103:37476, producerName=standalone-1-142, producerId=0}
2019-10-31 12:39:45.483 INFO ProducerImpl:155 | [persistent://public/default/pulsar-log, ] Created producer on broker [10.85.179.103:37476 -> 10.85.179.103:6650
]
2019-10-31 12:39:45.504 INFO Client:88 | Subscribing on Topic :my-topic
2019-10-31 12:39:45.505 INFO HandlerBase:52 | [persistent://public/default/my-topic, public/default/womble, 0] Getting connection from pool
12:39:45.505 [pulsar-io-50-6] INFO org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37476] Subscribing on topic persistent://public/default/my-topic
/ public/default/womble
12:39:45.505 [pulsar-io-50-6] INFO org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/my-topic-public%2Fdefault%2Fwomble] Rewind
from 36531:29 to 36531:9
12:39:45.506 [pulsar-io-50-6] INFO org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic] There are no replicated
subscriptions on the topic
12:39:45.506 [pulsar-io-50-6] INFO org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic][public/default/womble] C
reated new subscription for 0
12:39:45.506 [pulsar-io-50-6] INFO org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37476] Created subscription on topic persistent://public/default
/my-topic / public/default/womble
2019-10-31 12:39:45.506 INFO ConsumerImpl:170 | [persistent://public/default/my-topic, public/default/womble, 0] Created consumer on broker [10.85.179.103:37476
-> 10.85.179.103:6650]
12:39:45.598 [pulsar-io-50-6] INFO org.apache.pulsar.broker.service.ServerCnx - Closed connection from /10.85.179.103:37476
12:39:45.598 [pulsar-io-50-6] INFO org.apache.pulsar.broker.service.persistent.PersistentDispatcherMultipleConsumers - Removed consumer Consumer{subscription=Pe
rsistentSubscription{topic=persistent://public/default/my-topic, name=public/default/womble}, consumerId=0, consumerName=f01655, address=/10.85.179.103:37476} with pending 20 acks
12:40:00.754 [pulsar-web-57-5] INFO org.eclipse.jetty.server.RequestLog - 10.85.179.103 - - [31/Oct/2019:12:40:00 +0000] "GET /admin/v2/persistent/public/functi
ons/coordinate/stats HTTP/1.1" 200 851 "-" "Pulsar-Java-v2.4.1" 2
12:40:15.002 [function-timer-thread-94-1] ERROR org.apache.pulsar.functions.runtime.ProcessRuntime - Extracted Process death exception
java.lang.RuntimeException:
at org.apache.pulsar.functions.runtime.ProcessRuntime.tryExtractingDeathException(ProcessRuntime.java:380) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
at org.apache.pulsar.functions.runtime.ProcessRuntime.isAlive(ProcessRuntime.java:367) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
at org.apache.pulsar.functions.runtime.RuntimeSpawner.lambda$start$0(RuntimeSpawner.java:88) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_222]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_222]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_222]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
12:40:15.003 [function-timer-thread-94-1] ERROR org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/womble-java.lang.RuntimeException: Function Container is dead with exception.. restarting
12:40:15.003 [function-timer-thread-94-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Creating function log directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions/public/default/womble
12:40:15.003 [function-timer-thread-94-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Created or found function log directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions/public/default/womble
12:40:15.004 [function-timer-thread-94-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - ProcessBuilder starting the process with args python /home/ubuntu/apache-pulsar-2.4.1/instances/python-instance/python_instance_main.py --py /tmp/pulsar_functions/public/default/womble/0/func1.py --logging_directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions --logging_file womble --logging_config_file /home/ubuntu/apache-pulsar-2.4.1/conf/functions-logging/logging_config.ini --instance_id 0 --function_id 67fb9828-56ab-49ff-bbbc-56db9510d26c --function_version 4f01d696-28da-40b7-b9f5-2283fbe9bd7c --function_details '{"tenant":"public","namespace":"default","name":"womble","className":"FirstFunction","logTopic":"pulsar-log","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpecs":{"my-topic":{}},"cleanupSubscription":true},"sink":{},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}' --pulsar_serviceurl pulsar://ldex-pulsar.int.example.net:6650 --max_buffered_tuples 1024 --port 38423 --metrics_port 36591 --state_storage_serviceurl bk://127.0.0.1:4181 --expected_healthcheck_interval 30 --secrets_provider secretsprovider.ClearTextSecretsProvider --cluster_name standalone
12:40:15.010 [function-timer-thread-94-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Started process successfully
2019-10-31 12:40:15.481 INFO ConnectionPool:72 | Created connection for pulsar://ldex-pulsar.int.example.net:6650
2019-10-31 12:40:15.483 INFO ClientConnection:324 | [10.85.179.103:37928 -> 10.85.179.103:6650] Connected to broker
12:40:15.484 [pulsar-io-50-7] INFO org.apache.pulsar.broker.service.ServerCnx - New connection from /10.85.179.103:37928
2019-10-31 12:40:15.485 INFO BatchMessageContainer:43 | { BatchContainer [size = 0] [batchSizeInBytes_ = 0] [maxAllowedMessageBatchSizeInBytes_ = 131072] [maxAllowedNumMessagesInBatch_ = 1000] [topicName = persistent://public/default/pulsar-log] [producerName_ = ] [batchSizeInBytes_ = 0] [numberOfBatchesSent = 0] [averageBatchSize = 0]} BatchMessageContainer constructed
2019-10-31 12:40:15.485 INFO HandlerBase:52 | [persistent://public/default/pulsar-log, ] Getting connection from pool
12:40:15.487 [pulsar-io-50-7] INFO org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37928][persistent://public/default/pulsar-log] Creating producer. producerId=0
12:40:15.487 [ForkJoinPool.commonPool-worker-2] INFO org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37928]-0 persistent://public/default/pulsar-log configured with schema false
12:40:15.488 [ForkJoinPool.commonPool-worker-2] INFO org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37928] Created new producer: Producer{topic=PersistentTopic{topic=persistent://public/default/pulsar-log}, client=/10.85.179.103:37928, producerName=standalone-1-143, producerId=0}
2019-10-31 12:40:15.488 INFO ProducerImpl:155 | [persistent://public/default/pulsar-log, ] Created producer on broker [10.85.179.103:37928 -> 10.85.179.103:6650]
2019-10-31 12:40:15.521 INFO Client:88 | Subscribing on Topic :my-topic
2019-10-31 12:40:15.522 INFO HandlerBase:52 | [persistent://public/default/my-topic, public/default/womble, 0] Getting connection from pool
12:40:15.523 [pulsar-io-50-7] INFO org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37928] Subscribing on topic persistent://public/default/my-topic / public/default/womble
12:40:15.524 [pulsar-io-50-7] INFO org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/my-topic-public%2Fdefault%2Fwomble] Rewind from 36531:29 to 36531:9
12:40:15.524 [pulsar-io-50-7] INFO org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic] There are no replicated subscriptions on the topic
12:40:15.524 [pulsar-io-50-7] INFO org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic][public/default/womble] Created new subscription for 0
12:40:15.524 [pulsar-io-50-7] INFO org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37928] Created subscription on topic persistent://public/default/my-topic / public/default/womble
2019-10-31 12:40:15.524 INFO ConsumerImpl:170 | [persistent://public/default/my-topic, public/default/womble, 0] Created consumer on broker [10.85.179.103:37928 -> 10.85.179.103:6650]
12:40:15.632 [pulsar-io-50-7] INFO org.apache.pulsar.broker.service.ServerCnx - Closed connection from /10.85.179.103:37928
12:40:15.633 [pulsar-io-50-7] INFO org.apache.pulsar.broker.service.persistent.PersistentDispatcherMultipleConsumers - Removed consumer Consumer{subscription=PersistentSubscription{topic=persistent://public/default/my-topic, name=public/default/womble}, consumerId=0, consumerName=b4b94f, address=/10.85.179.103:37928} with pending 20 acks
```
And so on.
Using localrun mode isn't any better:
```
$ apache-pulsar-2.4.1/bin/pulsar-admin functions localrun --name womble2 --inputs my-topic --py /home/ubuntu/func1.py --classname FirstFunction --log-topic pulsar-log
12:44:32.943 [main] INFO org.apache.pulsar.functions.runtime.ProcessRuntimeFactory - Java instance jar location is not defined, using the location defined in system environment : /home/ubuntu/apache-pulsar-2.4.1/instances/java-instance.jar
12:44:32.947 [main] INFO org.apache.pulsar.functions.runtime.ProcessRuntimeFactory - Python instance file location is not defined using the location defined in system environment : /home/ubuntu/apache-pulsar-2.4.1/instances/python-instance/python_instance_main.py
12:44:32.948 [main] INFO org.apache.pulsar.functions.runtime.ProcessRuntimeFactory - No extra dependencies location is defined in either function worker config or system environment
12:44:33.085 [main] INFO org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/womble2-0 RuntimeSpawner starting function
12:44:33.086 [main] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Creating function log directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions/public/default/womble2
12:44:33.088 [main] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Created or found function log directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions/public/default/womble2
12:44:33.088 [main] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - ProcessBuilder starting the process with args python /home/ubuntu/apache-pulsar-2.4.1/instances/python-instance/python_instance_main.py --py /home/ubuntu/func1.py --logging_directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions --logging_file womble2 --logging_config_file /home/ubuntu/apache-pulsar-2.4.1/conf/functions-logging/logging_config.ini --instance_id 0 --function_id a776f9d5-db5e-4b59-9b01-ab7661ea0df2 --function_version 48bb96e4-69a9-43f9-b880-7ae89d41b0e3 --function_details '{"tenant":"public","namespace":"default","name":"womble2","className":"FirstFunction","logTopic":"pulsar-log","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpecs":{"my-topic":{}},"cleanupSubscription":true},"sink":{},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}' --pulsar_serviceurl pulsar://localhost:6650 --use_tls false --tls_allow_insecure false --hostname_verification_enabled false --max_buffered_tuples 1024 --port 46365 --metrics_port 44849 --expected_healthcheck_interval 30 --secrets_provider secretsprovider.ClearTextSecretsProvider --cluster_name local
12:44:33.104 [main] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Started process successfully
2019-10-31 12:44:33.546 INFO ConnectionPool:72 | Created connection for pulsar://localhost:6650
2019-10-31 12:44:33.547 INFO ClientConnection:324 | [127.0.0.1:52338 -> 127.0.0.1:6650] Connected to broker
2019-10-31 12:44:33.549 INFO BatchMessageContainer:43 | { BatchContainer [size = 0] [batchSizeInBytes_ = 0] [maxAllowedMessageBatchSizeInBytes_ = 131072] [maxAllowedNumMessagesInBatch_ = 1000] [topicName = persistent://public/default/pulsar-log] [producerName_ = ] [batchSizeInBytes_ = 0] [numberOfBatchesSent = 0] [averageBatchSize = 0]} BatchMessageContainer constructed
2019-10-31 12:44:33.549 INFO HandlerBase:52 | [persistent://public/default/pulsar-log, ] Getting connection from pool
2019-10-31 12:44:33.550 INFO ConnectionPool:72 | Created connection for pulsar://ldex-pulsar.int.example.net:6650
2019-10-31 12:44:33.550 INFO ClientConnection:326 | [127.0.0.1:52340 -> 127.0.0.1:6650] Connected to broker through proxy. Logical broker: pulsar://ldex-pulsar.int.example.net:6650
2019-10-31 12:44:33.551 INFO ProducerImpl:155 | [persistent://public/default/pulsar-log, ] Created producer on broker [127.0.0.1:52340 -> 127.0.0.1:6650]
2019-10-31 12:44:33.553 INFO Client:88 | Subscribing on Topic :my-topic
2019-10-31 12:44:33.554 INFO HandlerBase:52 | [persistent://public/default/my-topic, public/default/womble2, 0] Getting connection from pool
2019-10-31 12:44:33.626 INFO ConsumerImpl:170 | [persistent://public/default/my-topic, public/default/womble2, 0] Created consumer on broker [127.0.0.1:52340 -> 127.0.0.1:6650]
12:44:33.703 [main] INFO org.apache.pulsar.functions.LocalRunner - RuntimeSpawner quit because of
<< hangs for 30 seconds here - right after "because of" !! >>
12:45:03.838 [function-timer-thread-3-1] ERROR org.apache.pulsar.functions.runtime.ProcessRuntime - Health check failed for womble2-0
java.util.concurrent.ExecutionException: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_222]
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) ~[?:1.8.0_222]
at org.apache.pulsar.functions.runtime.ProcessRuntime.lambda$start$1(ProcessRuntime.java:164) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_222]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_222]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_222]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at io.grpc.Status.asRuntimeException(Status.java:530) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:482) ~[io.grpc-grpc-stub-1.18.0.jar:1.18.0]
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:699) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:397) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:459) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:546) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:467) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:584) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_222]
... 1 more
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:46365
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_222]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_222]
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:591) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:508) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
... 2 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_222]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_222]
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:591) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:508) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
... 2 more
12:45:03.864 [function-timer-thread-3-1] ERROR org.apache.pulsar.functions.runtime.ProcessRuntime - Extracted Process death exception
java.lang.RuntimeException:
at org.apache.pulsar.functions.runtime.ProcessRuntime.tryExtractingDeathException(ProcessRuntime.java:380) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
at org.apache.pulsar.functions.runtime.ProcessRuntime.isAlive(ProcessRuntime.java:367) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
at org.apache.pulsar.functions.runtime.RuntimeSpawner.lambda$start$0(RuntimeSpawner.java:88) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_222]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_222]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_222]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
12:45:03.866 [function-timer-thread-3-1] ERROR org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/womble2-java.lang.RuntimeException: Function Container is dead with exception.. restarting
12:45:03.872 [function-timer-thread-3-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Creating function log directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions/public/default/womble2
12:45:03.874 [function-timer-thread-3-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Created or found function log directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions/public/default/womble2
12:45:03.875 [function-timer-thread-3-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - ProcessBuilder starting the process with args python /home/ubuntu/apache-pulsar-2.4.1/instances/python-instance/python_instance_main.py --py /home/ubuntu/func1.py --logging_directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions --logging_file womble2 --logging_config_file /home/ubuntu/apache-pulsar-2.4.1/conf/functions-logging/logging_config.ini --instance_id 0 --function_id a776f9d5-db5e-4b59-9b01-ab7661ea0df2 --function_version 48bb96e4-69a9-43f9-b880-7ae89d41b0e3 --function_details '{"tenant":"public","namespace":"default","name":"womble2","className":"FirstFunction","logTopic":"pulsar-log","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpecs":{"my-topic":{}},"cleanupSubscription":true},"sink":{},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}' --pulsar_serviceurl pulsar://localhost:6650 --use_tls false --tls_allow_insecure false --hostname_verification_enabled false --max_buffered_tuples 1024 --port 46365 --metrics_port 44849 --expected_healthcheck_interval 30 --secrets_provider secretsprovider.ClearTextSecretsProvider --cluster_name local
12:45:03.886 [function-timer-thread-3-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Started process successfully
12:45:03.912 [Timer-0] INFO org.apache.pulsar.functions.LocalRunner - {
"failureException": "UNAVAILABLE: io exception",
"instanceId": "0"
}
2019-10-31 12:45:04.859 INFO ConnectionPool:72 | Created connection for pulsar://localhost:6650
2019-10-31 12:45:04.867 INFO ClientConnection:324 | [127.0.0.1:52910 -> 127.0.0.1:6650] Connected to broker
2019-10-31 12:45:04.878 INFO BatchMessageContainer:43 | { BatchContainer [size = 0] [batchSizeInBytes_ = 0] [maxAllowedMessageBatchSizeInBytes_ = 131072] [maxAllowedNumMessagesInBatch_ = 1000] [topicName = persistent://public/default/pulsar-log] [producerName_ = ] [batchSizeInBytes_ = 0] [numberOfBatchesSent = 0] [averageBatchSize = 0]} BatchMessageContainer constructed
2019-10-31 12:45:04.878 INFO HandlerBase:52 | [persistent://public/default/pulsar-log, ] Getting connection from pool
2019-10-31 12:45:04.887 INFO ConnectionPool:72 | Created connection for pulsar://ldex-pulsar.int.example.net:6650
2019-10-31 12:45:04.887 INFO ClientConnection:326 | [127.0.0.1:52912 -> 127.0.0.1:6650] Connected to broker through proxy. Logical broker: pulsar://ldex-pulsar.int.example.net:6650
2019-10-31 12:45:04.890 INFO ProducerImpl:155 | [persistent://public/default/pulsar-log, ] Created producer on broker [127.0.0.1:52912 -> 127.0.0.1:6650]
2019-10-31 12:45:04.900 INFO Client:88 | Subscribing on Topic :my-topic
2019-10-31 12:45:04.902 INFO HandlerBase:52 | [persistent://public/default/my-topic, public/default/womble2, 0] Getting connection from pool
2019-10-31 12:45:04.914 INFO ConsumerImpl:170 | [persistent://public/default/my-topic, public/default/womble2, 0] Created consumer on broker [127.0.0.1:52912 -> 127.0.0.1:6650]
^C12:45:12.279 [Thread-1] INFO org.apache.pulsar.functions.LocalRunner - Shutting down the localrun runtimeSpawner ...
```
Notice the 30 second delay as it was trying to report the exception.
Running localrun under strace, finally I can see the backtrace:
```
$ strace -f -s128 apache-pulsar-2.4.1/bin/pulsar-admin functions localrun --name womble2 --inputs my-topic --py /home/ubuntu/func1.py --classname FirstFunction --log-topic pulsar-log 2>/tmp/strace.out
...
...
[pid 17173] read(10, "#!/usr/bin/env python3\n#\n# Licensed to the Apache Software Foundation (ASF) under one\n# or more contributor license agreements. "..., 40
96) = 4096
[pid 17173] close(10) = 0
[pid 17173] lseek(9, 0, SEEK_SET) = 0
[pid 17173] lseek(9, 0, SEEK_CUR) = 0
[pid 17173] read(9, "#!/usr/bin/env python3\n#\n# Licensed to the Apache Software Foundation (ASF) under one\n# or more contributor license agreements. "..., 819
2) = 8192
[pid 17173] read(9, "Function Module %s\" % self.instance_config.function_details.className)\n raise NameError(\"Could not import User Function Mod"..., 819
2) = 8192
[pid 17173] close(9) = 0
[pid 17173] getpid() = 17173
[pid 17173] lseek(3, 0, SEEK_END) = 16934
[pid 17173] lseek(3, 0, SEEK_CUR) = 16934
[pid 17173] write(3, "[2019-10-31 12:48:33 +0000] [ERROR] log.py: raise NameError(\"Could not import User Function Module %s\" % self.instance_config.fu"..., 154
) = 154
[pid 17173] getpid() = 17173
[pid 17173] lseek(3, 0, SEEK_END) = 17088
[pid 17173] lseek(3, 0, SEEK_CUR) = 17088
[pid 17173] write(3, "[2019-10-31 12:48:33 +0000] [ERROR] log.py: NameError\n", 54) = 54
[pid 17173] getpid() = 17173
[pid 17173] lseek(3, 0, SEEK_END) = 17142
[pid 17173] lseek(3, 0, SEEK_CUR) = 17142
[pid 17173] write(3, "[2019-10-31 12:48:33 +0000] [ERROR] log.py: :\n", 46) = 46
[pid 17173] getpid() = 17173
[pid 17173] lseek(3, 0, SEEK_END) = 17188
[pid 17173] lseek(3, 0, SEEK_CUR) = 17188
[pid 17173] write(3, "[2019-10-31 12:48:33 +0000] [ERROR] log.py: Could not import User Function Module FirstFunction\n", 96) = 96
[pid 17173] close(3) = 0
...
[pid 17174] read(3, "\312\376\272\276\0\0\0004\0[\7\0*\n\0+\0,\n\0\1\0-\n\0\24\0.\7\0/\n\0\5\0000\t\0\23\0001\n\0002\0003\n\0004\0005\n\0002\0006\t\0\23\0007\n\0
\23\0008\n\0002\0009\t\0:\0;\7\0<\n\0\17\0=\7\0>\n\0\24\0009\7\0?\7\0B\1\0\tcloseLock\1\0\22Ljava/lang/Obje"..., 1711) = 1711
[pid 17158] write(1, "12:48:33.287 [main] INFO org.apache.pulsar.functions.LocalRunner - RuntimeSpawner quit because of\n", 99 <unfinished ...>
```
**Expected behavior**
I would expect the stderr which my function generates before it dies, to be captured and displayed in logs somewhere.
**Screenshots**
See above
**Desktop (please complete the following information):**
- Ubuntu 18.04 (running in lxd container)
**Additional context**
The fact that my function doesn't start could be a whole bunch of reasons - `log.py` suggests it's because I'm trying to use log-topic - but that's a separate issue that I need to debug.
The issue here is that the message which might help has not been kept.
I can see there is other information which is being logged but lost. That error comes from `python_instance.py` which in turn calls `import_class()` from `util.py`
```
def import_class(from_path, full_class_name):
from_path = str(from_path)
full_class_name = str(full_class_name)
try:
return import_class_from_path(from_path, full_class_name)
except Exception as e:
our_dir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
api_dir = os.path.join(our_dir, PULSAR_API_ROOT, PULSAR_FUNCTIONS_API_ROOT)
try:
return import_class_from_path(api_dir, full_class_name)
except Exception as e:
Log.info("Failed to import class %s from path %s" % (full_class_name, from_path))
Log.info(e, exc_info=True)
return None
```
When this function returns None there should be a Log.info(). Looking in the strace output I see the detail there:
```
$ grep "Failed to import class" /tmp/strace.out
[pid 17173] write(3, "[2019-10-31 12:48:33 +0000] [INFO] util.py: Failed to import class FirstFunction from path /home/ubuntu\n", 104) = 104
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services