You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/10/31 13:05:08 UTC

[GitHub] [pulsar] candlerb opened a new issue #5521: Python exception message not shown when python function fails to start

candlerb opened a new issue #5521: Python exception message not shown when python function fails to start
URL: https://github.com/apache/pulsar/issues/5521
 
 
   **Describe the bug**
   I have a problem starting a python function.  However, the actual python error message (reported on stderr) is being lost.  I can only see it if I "strace" the entire pulsar stack.
   
   **To Reproduce**
   I am using pulsar functions with pulsar 2.4.1 in standalone mode, base OS is Ubuntu 18.04, and I've done the frig to link "python" to "python3":
   
   ```
   sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 10
   ```
   
   Here's my function:
   
   ```
   $ cat func1.py
   from pulsar import Function
   
   class FirstFunction(Function):
       def process(self, item, context):
           log = context.get_logger()
           log.info("Got %r with properties %r" % (item, context.get_message_properties()))
   ```
   
   I can run it from the command line and it just returns without error.
   
   Deployed:
   
   ```
   $ apache-pulsar-2.4.1/bin/pulsar-admin functions create --name womble --inputs my-topic --py /home/ubuntu/func1.py --classname FirstFunction --log-topic pulsar-log
   
   $ apache-pulsar-2.4.1/bin/pulsar-admin functions status --name womble
   {
     "numInstances" : 1,
     "numRunning" : 0,
     "instances" : [ {
       "instanceId" : 0,
       "status" : {
         "running" : false,
         "error" : "",
         "numRestarts" : 222,
         "numReceived" : 0,
         "numSuccessfullyProcessed" : 0,
         "numUserExceptions" : 0,
         "latestUserExceptions" : [ ],
         "numSystemExceptions" : 0,
         "latestSystemExceptions" : [ ],
         "averageLatency" : 0.0,
         "lastInvocationTime" : 0,
         "workerId" : "c-standalone-fw-ldex-pulsar.int.example.net-8080"
       }
     } ]
   }
   ```
   
   Looking at the pulsar process output, I see it restarting every 30 seconds but no indication of why it's dying.  There's a "DeathException" but no further info - not even the process exit code as fas as I can see.
   
   ```
   12:39:45.003 [function-timer-thread-94-1] ERROR org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/womble-java.lang.RuntimeException:  Function
   Container is dead with exception.. restarting
   12:39:45.003 [function-timer-thread-94-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Creating function log directory /home/ubuntu/apache-pulsar-2
   .4.1/logs/functions/public/default/womble
   12:39:45.003 [function-timer-thread-94-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Created or found function log directory /home/ubuntu/apache-
   pulsar-2.4.1/logs/functions/public/default/womble
   12:39:45.003 [function-timer-thread-94-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - ProcessBuilder starting the process with args python /home/u
   buntu/apache-pulsar-2.4.1/instances/python-instance/python_instance_main.py --py /tmp/pulsar_functions/public/default/womble/0/func1.py --logging_directory /home
   /ubuntu/apache-pulsar-2.4.1/logs/functions --logging_file womble --logging_config_file /home/ubuntu/apache-pulsar-2.4.1/conf/functions-logging/logging_config.ini
    --instance_id 0 --function_id 67fb9828-56ab-49ff-bbbc-56db9510d26c --function_version 4f01d696-28da-40b7-b9f5-2283fbe9bd7c --function_details '{"tenant":"public
   ","namespace":"default","name":"womble","className":"FirstFunction","logTopic":"pulsar-log","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpe
   cs":{"my-topic":{}},"cleanupSubscription":true},"sink":{},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}' --pulsar_s
   erviceurl pulsar://ldex-pulsar.int.example.net:6650 --max_buffered_tuples 1024 --port 38423 --metrics_port 36591 --state_storage_serviceurl bk://127.0.0.1:418
   1 --expected_healthcheck_interval 30 --secrets_provider secretsprovider.ClearTextSecretsProvider --cluster_name standalone
   12:39:45.005 [function-timer-thread-94-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Started process successfully
   2019-10-31 12:39:45.478 INFO  ConnectionPool:72 | Created connection for pulsar://ldex-pulsar.int.example.net:6650
   2019-10-31 12:39:45.480 INFO  ClientConnection:324 | [10.85.179.103:37476 -> 10.85.179.103:6650] Connected to broker
   12:39:45.480 [pulsar-io-50-6] INFO  org.apache.pulsar.broker.service.ServerCnx - New connection from /10.85.179.103:37476
   2019-10-31 12:39:45.481 INFO  BatchMessageContainer:43 | { BatchContainer [size = 0] [batchSizeInBytes_ = 0] [maxAllowedMessageBatchSizeInBytes_ = 131072] [maxAl
   lowedNumMessagesInBatch_ = 1000] [topicName = persistent://public/default/pulsar-log] [producerName_ = ] [batchSizeInBytes_ = 0] [numberOfBatchesSent = 0] [avera
   geBatchSize = 0]} BatchMessageContainer constructed
   2019-10-31 12:39:45.481 INFO  HandlerBase:52 | [persistent://public/default/pulsar-log, ] Getting connection from pool
   12:39:45.482 [pulsar-io-50-6] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37476][persistent://public/default/pulsar-log] Creating producer
   . producerId=0
   12:39:45.483 [ForkJoinPool.commonPool-worker-1] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37476]-0 persistent://public/default/pulsar-lo
   g configured with schema false
   12:39:45.483 [ForkJoinPool.commonPool-worker-1] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37476] Created new producer: Producer{topic=Pe
   rsistentTopic{topic=persistent://public/default/pulsar-log}, client=/10.85.179.103:37476, producerName=standalone-1-142, producerId=0}
   2019-10-31 12:39:45.483 INFO  ProducerImpl:155 | [persistent://public/default/pulsar-log, ] Created producer on broker [10.85.179.103:37476 -> 10.85.179.103:6650
   ]
   2019-10-31 12:39:45.504 INFO  Client:88 | Subscribing on Topic :my-topic
   2019-10-31 12:39:45.505 INFO  HandlerBase:52 | [persistent://public/default/my-topic, public/default/womble, 0] Getting connection from pool
   12:39:45.505 [pulsar-io-50-6] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37476] Subscribing on topic persistent://public/default/my-topic
    / public/default/womble
   12:39:45.505 [pulsar-io-50-6] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/my-topic-public%2Fdefault%2Fwomble] Rewind
   from 36531:29 to 36531:9
   12:39:45.506 [pulsar-io-50-6] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic] There are no replicated
   subscriptions on the topic
   12:39:45.506 [pulsar-io-50-6] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic][public/default/womble] C
   reated new subscription for 0
   12:39:45.506 [pulsar-io-50-6] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37476] Created subscription on topic persistent://public/default
   /my-topic / public/default/womble
   2019-10-31 12:39:45.506 INFO  ConsumerImpl:170 | [persistent://public/default/my-topic, public/default/womble, 0] Created consumer on broker [10.85.179.103:37476
    -> 10.85.179.103:6650]
   12:39:45.598 [pulsar-io-50-6] INFO  org.apache.pulsar.broker.service.ServerCnx - Closed connection from /10.85.179.103:37476
   12:39:45.598 [pulsar-io-50-6] INFO  org.apache.pulsar.broker.service.persistent.PersistentDispatcherMultipleConsumers - Removed consumer Consumer{subscription=Pe
   rsistentSubscription{topic=persistent://public/default/my-topic, name=public/default/womble}, consumerId=0, consumerName=f01655, address=/10.85.179.103:37476} with pending 20 acks
   12:40:00.754 [pulsar-web-57-5] INFO  org.eclipse.jetty.server.RequestLog - 10.85.179.103 - - [31/Oct/2019:12:40:00 +0000] "GET /admin/v2/persistent/public/functi
   ons/coordinate/stats HTTP/1.1" 200 851 "-" "Pulsar-Java-v2.4.1" 2
   12:40:15.002 [function-timer-thread-94-1] ERROR org.apache.pulsar.functions.runtime.ProcessRuntime - Extracted Process death exception
   java.lang.RuntimeException:
           at org.apache.pulsar.functions.runtime.ProcessRuntime.tryExtractingDeathException(ProcessRuntime.java:380) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
           at org.apache.pulsar.functions.runtime.ProcessRuntime.isAlive(ProcessRuntime.java:367) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
           at org.apache.pulsar.functions.runtime.RuntimeSpawner.lambda$start$0(RuntimeSpawner.java:88) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
           at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_222]
           at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_222]
           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_222]
           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_222]
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
           at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
           at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
   12:40:15.003 [function-timer-thread-94-1] ERROR org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/womble-java.lang.RuntimeException:  Function Container is dead with exception.. restarting
   12:40:15.003 [function-timer-thread-94-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Creating function log directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions/public/default/womble
   12:40:15.003 [function-timer-thread-94-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Created or found function log directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions/public/default/womble
   12:40:15.004 [function-timer-thread-94-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - ProcessBuilder starting the process with args python /home/ubuntu/apache-pulsar-2.4.1/instances/python-instance/python_instance_main.py --py /tmp/pulsar_functions/public/default/womble/0/func1.py --logging_directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions --logging_file womble --logging_config_file /home/ubuntu/apache-pulsar-2.4.1/conf/functions-logging/logging_config.ini --instance_id 0 --function_id 67fb9828-56ab-49ff-bbbc-56db9510d26c --function_version 4f01d696-28da-40b7-b9f5-2283fbe9bd7c --function_details '{"tenant":"public","namespace":"default","name":"womble","className":"FirstFunction","logTopic":"pulsar-log","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpecs":{"my-topic":{}},"cleanupSubscription":true},"sink":{},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}' --pulsar_serviceurl pulsar://ldex-pulsar.int.example.net:6650 --max_buffered_tuples 1024 --port 38423 --metrics_port 36591 --state_storage_serviceurl bk://127.0.0.1:4181 --expected_healthcheck_interval 30 --secrets_provider secretsprovider.ClearTextSecretsProvider --cluster_name standalone
   12:40:15.010 [function-timer-thread-94-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Started process successfully
   2019-10-31 12:40:15.481 INFO  ConnectionPool:72 | Created connection for pulsar://ldex-pulsar.int.example.net:6650
   2019-10-31 12:40:15.483 INFO  ClientConnection:324 | [10.85.179.103:37928 -> 10.85.179.103:6650] Connected to broker
   12:40:15.484 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - New connection from /10.85.179.103:37928
   2019-10-31 12:40:15.485 INFO  BatchMessageContainer:43 | { BatchContainer [size = 0] [batchSizeInBytes_ = 0] [maxAllowedMessageBatchSizeInBytes_ = 131072] [maxAllowedNumMessagesInBatch_ = 1000] [topicName = persistent://public/default/pulsar-log] [producerName_ = ] [batchSizeInBytes_ = 0] [numberOfBatchesSent = 0] [averageBatchSize = 0]} BatchMessageContainer constructed
   2019-10-31 12:40:15.485 INFO  HandlerBase:52 | [persistent://public/default/pulsar-log, ] Getting connection from pool
   12:40:15.487 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37928][persistent://public/default/pulsar-log] Creating producer. producerId=0
   12:40:15.487 [ForkJoinPool.commonPool-worker-2] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37928]-0 persistent://public/default/pulsar-log configured with schema false
   12:40:15.488 [ForkJoinPool.commonPool-worker-2] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37928] Created new producer: Producer{topic=PersistentTopic{topic=persistent://public/default/pulsar-log}, client=/10.85.179.103:37928, producerName=standalone-1-143, producerId=0}
   2019-10-31 12:40:15.488 INFO  ProducerImpl:155 | [persistent://public/default/pulsar-log, ] Created producer on broker [10.85.179.103:37928 -> 10.85.179.103:6650]
   2019-10-31 12:40:15.521 INFO  Client:88 | Subscribing on Topic :my-topic
   2019-10-31 12:40:15.522 INFO  HandlerBase:52 | [persistent://public/default/my-topic, public/default/womble, 0] Getting connection from pool
   12:40:15.523 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37928] Subscribing on topic persistent://public/default/my-topic / public/default/womble
   12:40:15.524 [pulsar-io-50-7] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/my-topic-public%2Fdefault%2Fwomble] Rewind from 36531:29 to 36531:9
   12:40:15.524 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic] There are no replicated subscriptions on the topic
   12:40:15.524 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic][public/default/womble] Created new subscription for 0
   12:40:15.524 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.85.179.103:37928] Created subscription on topic persistent://public/default/my-topic / public/default/womble
   2019-10-31 12:40:15.524 INFO  ConsumerImpl:170 | [persistent://public/default/my-topic, public/default/womble, 0] Created consumer on broker [10.85.179.103:37928 -> 10.85.179.103:6650]
   12:40:15.632 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - Closed connection from /10.85.179.103:37928
   12:40:15.633 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentDispatcherMultipleConsumers - Removed consumer Consumer{subscription=PersistentSubscription{topic=persistent://public/default/my-topic, name=public/default/womble}, consumerId=0, consumerName=b4b94f, address=/10.85.179.103:37928} with pending 20 acks
   ```
   
   And so on.
   
   Using localrun mode isn't any better:
   
   ```
   $ apache-pulsar-2.4.1/bin/pulsar-admin functions localrun --name womble2 --inputs my-topic --py /home/ubuntu/func1.py --classname FirstFunction --log-topic pulsar-log
   12:44:32.943 [main] INFO  org.apache.pulsar.functions.runtime.ProcessRuntimeFactory - Java instance jar location is not defined, using the location defined in system environment : /home/ubuntu/apache-pulsar-2.4.1/instances/java-instance.jar
   12:44:32.947 [main] INFO  org.apache.pulsar.functions.runtime.ProcessRuntimeFactory - Python instance file location is not defined using the location defined in system environment : /home/ubuntu/apache-pulsar-2.4.1/instances/python-instance/python_instance_main.py
   12:44:32.948 [main] INFO  org.apache.pulsar.functions.runtime.ProcessRuntimeFactory - No extra dependencies location is defined in either function worker config or system environment
   12:44:33.085 [main] INFO  org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/womble2-0 RuntimeSpawner starting function
   12:44:33.086 [main] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Creating function log directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions/public/default/womble2
   12:44:33.088 [main] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Created or found function log directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions/public/default/womble2
   12:44:33.088 [main] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - ProcessBuilder starting the process with args python /home/ubuntu/apache-pulsar-2.4.1/instances/python-instance/python_instance_main.py --py /home/ubuntu/func1.py --logging_directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions --logging_file womble2 --logging_config_file /home/ubuntu/apache-pulsar-2.4.1/conf/functions-logging/logging_config.ini --instance_id 0 --function_id a776f9d5-db5e-4b59-9b01-ab7661ea0df2 --function_version 48bb96e4-69a9-43f9-b880-7ae89d41b0e3 --function_details '{"tenant":"public","namespace":"default","name":"womble2","className":"FirstFunction","logTopic":"pulsar-log","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpecs":{"my-topic":{}},"cleanupSubscription":true},"sink":{},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}' --pulsar_serviceurl pulsar://localhost:6650 --use_tls false --tls_allow_insecure false --hostname_verification_enabled false --max_buffered_tuples 1024 --port 46365 --metrics_port 44849 --expected_healthcheck_interval 30 --secrets_provider secretsprovider.ClearTextSecretsProvider --cluster_name local
   12:44:33.104 [main] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Started process successfully
   2019-10-31 12:44:33.546 INFO  ConnectionPool:72 | Created connection for pulsar://localhost:6650
   2019-10-31 12:44:33.547 INFO  ClientConnection:324 | [127.0.0.1:52338 -> 127.0.0.1:6650] Connected to broker
   2019-10-31 12:44:33.549 INFO  BatchMessageContainer:43 | { BatchContainer [size = 0] [batchSizeInBytes_ = 0] [maxAllowedMessageBatchSizeInBytes_ = 131072] [maxAllowedNumMessagesInBatch_ = 1000] [topicName = persistent://public/default/pulsar-log] [producerName_ = ] [batchSizeInBytes_ = 0] [numberOfBatchesSent = 0] [averageBatchSize = 0]} BatchMessageContainer constructed
   2019-10-31 12:44:33.549 INFO  HandlerBase:52 | [persistent://public/default/pulsar-log, ] Getting connection from pool
   2019-10-31 12:44:33.550 INFO  ConnectionPool:72 | Created connection for pulsar://ldex-pulsar.int.example.net:6650
   2019-10-31 12:44:33.550 INFO  ClientConnection:326 | [127.0.0.1:52340 -> 127.0.0.1:6650] Connected to broker through proxy. Logical broker: pulsar://ldex-pulsar.int.example.net:6650
   2019-10-31 12:44:33.551 INFO  ProducerImpl:155 | [persistent://public/default/pulsar-log, ] Created producer on broker [127.0.0.1:52340 -> 127.0.0.1:6650]
   2019-10-31 12:44:33.553 INFO  Client:88 | Subscribing on Topic :my-topic
   2019-10-31 12:44:33.554 INFO  HandlerBase:52 | [persistent://public/default/my-topic, public/default/womble2, 0] Getting connection from pool
   2019-10-31 12:44:33.626 INFO  ConsumerImpl:170 | [persistent://public/default/my-topic, public/default/womble2, 0] Created consumer on broker [127.0.0.1:52340 -> 127.0.0.1:6650]
   12:44:33.703 [main] INFO  org.apache.pulsar.functions.LocalRunner - RuntimeSpawner quit because of
   
   << hangs for 30 seconds here - right after "because of" !! >>
   
   12:45:03.838 [function-timer-thread-3-1] ERROR org.apache.pulsar.functions.runtime.ProcessRuntime - Health check failed for womble2-0
   java.util.concurrent.ExecutionException: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
       at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_222]
       at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) ~[?:1.8.0_222]
       at org.apache.pulsar.functions.runtime.ProcessRuntime.lambda$start$1(ProcessRuntime.java:164) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_222]
       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_222]
       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_222]
       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_222]
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
       at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
   Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
       at io.grpc.Status.asRuntimeException(Status.java:530) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:482) ~[io.grpc-grpc-stub-1.18.0.jar:1.18.0]
       at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:699) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:397) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:459) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:546) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:467) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:584) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) ~[io.grpc-grpc-core-1.18.0.jar:1.18.0]
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_222]
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_222]
       ... 1 more
   Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:46365
       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_222]
       at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_222]
       at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:591) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:508) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       ... 2 more
   Caused by: java.net.ConnectException: Connection refused
       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_222]
       at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_222]
       at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:591) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:508) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) ~[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       ... 2 more
   12:45:03.864 [function-timer-thread-3-1] ERROR org.apache.pulsar.functions.runtime.ProcessRuntime - Extracted Process death exception
   java.lang.RuntimeException:
       at org.apache.pulsar.functions.runtime.ProcessRuntime.tryExtractingDeathException(ProcessRuntime.java:380) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
       at org.apache.pulsar.functions.runtime.ProcessRuntime.isAlive(ProcessRuntime.java:367) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
       at org.apache.pulsar.functions.runtime.RuntimeSpawner.lambda$start$0(RuntimeSpawner.java:88) ~[org.apache.pulsar-pulsar-functions-runtime-2.4.1.jar:2.4.1]
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_222]
       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_222]
       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_222]
       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_222]
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
       at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
       at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
   12:45:03.866 [function-timer-thread-3-1] ERROR org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/womble2-java.lang.RuntimeException:  Function Container is dead with exception.. restarting
   12:45:03.872 [function-timer-thread-3-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Creating function log directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions/public/default/womble2
   12:45:03.874 [function-timer-thread-3-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Created or found function log directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions/public/default/womble2
   12:45:03.875 [function-timer-thread-3-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - ProcessBuilder starting the process with args python /home/ubuntu/apache-pulsar-2.4.1/instances/python-instance/python_instance_main.py --py /home/ubuntu/func1.py --logging_directory /home/ubuntu/apache-pulsar-2.4.1/logs/functions --logging_file womble2 --logging_config_file /home/ubuntu/apache-pulsar-2.4.1/conf/functions-logging/logging_config.ini --instance_id 0 --function_id a776f9d5-db5e-4b59-9b01-ab7661ea0df2 --function_version 48bb96e4-69a9-43f9-b880-7ae89d41b0e3 --function_details '{"tenant":"public","namespace":"default","name":"womble2","className":"FirstFunction","logTopic":"pulsar-log","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpecs":{"my-topic":{}},"cleanupSubscription":true},"sink":{},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}' --pulsar_serviceurl pulsar://localhost:6650 --use_tls false --tls_allow_insecure false --hostname_verification_enabled false --max_buffered_tuples 1024 --port 46365 --metrics_port 44849 --expected_healthcheck_interval 30 --secrets_provider secretsprovider.ClearTextSecretsProvider --cluster_name local
   12:45:03.886 [function-timer-thread-3-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Started process successfully
   12:45:03.912 [Timer-0] INFO  org.apache.pulsar.functions.LocalRunner - {
     "failureException": "UNAVAILABLE: io exception",
     "instanceId": "0"
   }
   2019-10-31 12:45:04.859 INFO  ConnectionPool:72 | Created connection for pulsar://localhost:6650
   2019-10-31 12:45:04.867 INFO  ClientConnection:324 | [127.0.0.1:52910 -> 127.0.0.1:6650] Connected to broker
   2019-10-31 12:45:04.878 INFO  BatchMessageContainer:43 | { BatchContainer [size = 0] [batchSizeInBytes_ = 0] [maxAllowedMessageBatchSizeInBytes_ = 131072] [maxAllowedNumMessagesInBatch_ = 1000] [topicName = persistent://public/default/pulsar-log] [producerName_ = ] [batchSizeInBytes_ = 0] [numberOfBatchesSent = 0] [averageBatchSize = 0]} BatchMessageContainer constructed
   2019-10-31 12:45:04.878 INFO  HandlerBase:52 | [persistent://public/default/pulsar-log, ] Getting connection from pool
   2019-10-31 12:45:04.887 INFO  ConnectionPool:72 | Created connection for pulsar://ldex-pulsar.int.example.net:6650
   2019-10-31 12:45:04.887 INFO  ClientConnection:326 | [127.0.0.1:52912 -> 127.0.0.1:6650] Connected to broker through proxy. Logical broker: pulsar://ldex-pulsar.int.example.net:6650
   2019-10-31 12:45:04.890 INFO  ProducerImpl:155 | [persistent://public/default/pulsar-log, ] Created producer on broker [127.0.0.1:52912 -> 127.0.0.1:6650]
   2019-10-31 12:45:04.900 INFO  Client:88 | Subscribing on Topic :my-topic
   2019-10-31 12:45:04.902 INFO  HandlerBase:52 | [persistent://public/default/my-topic, public/default/womble2, 0] Getting connection from pool
   2019-10-31 12:45:04.914 INFO  ConsumerImpl:170 | [persistent://public/default/my-topic, public/default/womble2, 0] Created consumer on broker [127.0.0.1:52912 -> 127.0.0.1:6650]
   ^C12:45:12.279 [Thread-1] INFO  org.apache.pulsar.functions.LocalRunner - Shutting down the localrun runtimeSpawner ...
   ```
   
   Notice the 30 second delay as it was trying to report the exception.
   
   Running localrun under strace, finally I can see the backtrace:
   
   ```
   $ strace -f -s128 apache-pulsar-2.4.1/bin/pulsar-admin functions localrun --name womble2 --inputs my-topic --py /home/ubuntu/func1.py --classname FirstFunction --log-topic pulsar-log 2>/tmp/strace.out
   ...
   
   ...
   [pid 17173] read(10, "#!/usr/bin/env python3\n#\n# Licensed to the Apache Software Foundation (ASF) under one\n# or more contributor license agreements. "..., 40
   96) = 4096
   [pid 17173] close(10)                   = 0
   [pid 17173] lseek(9, 0, SEEK_SET)       = 0
   [pid 17173] lseek(9, 0, SEEK_CUR)       = 0
   [pid 17173] read(9, "#!/usr/bin/env python3\n#\n# Licensed to the Apache Software Foundation (ASF) under one\n# or more contributor license agreements. "..., 819
   2) = 8192
   [pid 17173] read(9, "Function Module %s\" % self.instance_config.function_details.className)\n      raise NameError(\"Could not import User Function Mod"..., 819
   2) = 8192
   [pid 17173] close(9)                    = 0
   [pid 17173] getpid()                    = 17173
   [pid 17173] lseek(3, 0, SEEK_END)       = 16934
   [pid 17173] lseek(3, 0, SEEK_CUR)       = 16934
   [pid 17173] write(3, "[2019-10-31 12:48:33 +0000] [ERROR] log.py: raise NameError(\"Could not import User Function Module %s\" % self.instance_config.fu"..., 154
   ) = 154
   [pid 17173] getpid()                    = 17173
   [pid 17173] lseek(3, 0, SEEK_END)       = 17088
   [pid 17173] lseek(3, 0, SEEK_CUR)       = 17088
   [pid 17173] write(3, "[2019-10-31 12:48:33 +0000] [ERROR] log.py: NameError\n", 54) = 54
   [pid 17173] getpid()                    = 17173
   [pid 17173] lseek(3, 0, SEEK_END)       = 17142
   [pid 17173] lseek(3, 0, SEEK_CUR)       = 17142
   [pid 17173] write(3, "[2019-10-31 12:48:33 +0000] [ERROR] log.py: :\n", 46) = 46
   [pid 17173] getpid()                    = 17173
   [pid 17173] lseek(3, 0, SEEK_END)       = 17188
   [pid 17173] lseek(3, 0, SEEK_CUR)       = 17188
   [pid 17173] write(3, "[2019-10-31 12:48:33 +0000] [ERROR] log.py: Could not import User Function Module FirstFunction\n", 96) = 96
   [pid 17173] close(3)                    = 0
   
   ...
   
   [pid 17174] read(3, "\312\376\272\276\0\0\0004\0[\7\0*\n\0+\0,\n\0\1\0-\n\0\24\0.\7\0/\n\0\5\0000\t\0\23\0001\n\0002\0003\n\0004\0005\n\0002\0006\t\0\23\0007\n\0
   \23\0008\n\0002\0009\t\0:\0;\7\0<\n\0\17\0=\7\0>\n\0\24\0009\7\0?\7\0B\1\0\tcloseLock\1\0\22Ljava/lang/Obje"..., 1711) = 1711
   [pid 17158] write(1, "12:48:33.287 [main] INFO  org.apache.pulsar.functions.LocalRunner - RuntimeSpawner quit because of\n", 99 <unfinished ...>
   ```
   
   **Expected behavior**
   I would expect the stderr which my function generates before it dies, to be captured and displayed in logs somewhere.
   
   **Screenshots**
   See above
   
   **Desktop (please complete the following information):**
    - Ubuntu 18.04 (running in lxd container)
   
   **Additional context**
   The fact that my function doesn't start could be a whole bunch of reasons - `log.py` suggests it's because I'm trying to use log-topic - but that's a separate issue that I need to debug.
   
   The issue here is that the message which might help has not been kept.
   
   I can see there is other information which is being logged but lost.  That error comes from `python_instance.py` which in turn calls `import_class()` from `util.py`
   
   ```
   def import_class(from_path, full_class_name):
     from_path = str(from_path)
     full_class_name = str(full_class_name)
     try:
       return import_class_from_path(from_path, full_class_name)
     except Exception as e:
       our_dir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
       api_dir = os.path.join(our_dir, PULSAR_API_ROOT, PULSAR_FUNCTIONS_API_ROOT)
       try:
         return import_class_from_path(api_dir, full_class_name)
       except Exception as e:
         Log.info("Failed to import class %s from path %s" % (full_class_name, from_path))
         Log.info(e, exc_info=True)
         return None
   ```
   
   When this function returns None there should be a Log.info().  Looking in the strace output I see the detail there:
   
   ```
   $ grep "Failed to import class" /tmp/strace.out
   [pid 17173] write(3, "[2019-10-31 12:48:33 +0000] [INFO] util.py: Failed to import class FirstFunction from path /home/ubuntu\n", 104) = 104
   ```
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services