Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/11/08 06:57:35 UTC

[GitHub] [pulsar] Sunkwan-Kwon opened a new issue #5589: Memory leak of pulsar-function-go library

URL: https://github.com/apache/pulsar/issues/5589
 
 
   **Describe the bug**
   It seems that there is a memory leak in the `pulsar-function-go` library.
   
   I implemented a simple pulsar function worker that just writes logs using `pulsar-function-go/logutil`, which sends the logs to the log topic. I ran a long-term test, sending request messages to the input topic continuously, to check feasibility.
   
   At first I set `--log-topic` for the function worker, but I hit a `ProducerQueueIsFull` error after a few seconds.
   
   I then removed the `--log-topic` option to narrow down the cause. In this second test there was no more `ProducerQueueIsFull` error, but the memory of the pulsar function worker process grew indefinitely.
   
   I used `pprof` to pinpoint the root cause of the problem. Please refer to the result below.
   ```
   $ go tool pprof -top http://localhost:6060/debug/pprof/heap
   Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
   File: simple-worker
   Build ID: bbb7d25540b7a482661b05e4c0da0a5ba2ef7bae
   Type: inuse_space
   Time: Nov 8, 2019 at 3:44pm (KST)
   Showing nodes accounting for 12.51MB, 100% of 12.51MB total
         flat  flat%   sum%        cum   cum%
      12.51MB   100%   100%    12.51MB   100%  github.com/sirupsen/logrus.(*Entry).String
            0     0%   100%    12.51MB   100%  github.com/apache/pulsar/pulsar-function-go/logutil.(*contextHook).Fire
            0     0%   100%        1MB  7.99%  github.com/apache/pulsar/pulsar-function-go/logutil.Error
            0     0%   100%    11.51MB 92.01%  github.com/apache/pulsar/pulsar-function-go/logutil.Infof
            0     0%   100%        1MB  7.99%  github.com/apache/pulsar/pulsar-function-go/pf.(*goInstance).addLogTopicHandler
            0     0%   100%    11.51MB 92.01%  github.com/apache/pulsar/pulsar-function-go/pf.(*goInstance).handlerMsg
            0     0%   100%    12.51MB   100%  github.com/apache/pulsar/pulsar-function-go/pf.(*goInstance).startFunction
            0     0%   100%    12.51MB   100%  github.com/apache/pulsar/pulsar-function-go/pf.Start
            0     0%   100%    11.51MB 92.01%  github.com/apache/pulsar/pulsar-function-go/pf.newFunction.func1
            0     0%   100%    11.51MB 92.01%  github.com/apache/pulsar/pulsar-function-go/pf.pulsarFunction.process
            0     0%   100%    12.51MB   100%  github.com/sirupsen/logrus.(*Entry).Log
            0     0%   100%    11.51MB 92.01%  github.com/sirupsen/logrus.(*Entry).Logf
   
   ...
   ...
   ```
   
   As you can see, the strings produced for logging kept accumulating during the test.
   
   It seems that the `ProducerQueueIsFull` error with the `--log-topic` option and the memory leak without it have the same root cause. I modified the `pulsar-function-go` code to fix it, and the fix appears to work, so I will send a pull request for it.
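   For reference, the pattern I believe is responsible looks roughly like the sketch below. This is a hypothetical illustration with made-up names (`bufferingHook`, `Drain`), not the actual `logutil` code: a logrus hook serializes every entry via `entry.String()` and appends it to a buffer meant to be forwarded to the log topic, so if nothing ever drains that buffer, the strings pile up exactly where the heap profile points.
   ```
   package main
   
   import (
       "fmt"
       "io"
       "sync"
   
       "github.com/sirupsen/logrus"
   )
   
   // bufferingHook mimics the suspected leak pattern (hypothetical, not the
   // real logutil code): every log entry is serialized and kept in a shared
   // buffer so it can later be forwarded to a log topic producer. If nothing
   // drains the buffer (no --log-topic, or the producer queue is full), the
   // serialized strings accumulate without bound.
   type bufferingHook struct {
       mu      sync.Mutex
       pending []string // grows forever unless Drain is called
   }
   
   func (h *bufferingHook) Levels() []logrus.Level { return logrus.AllLevels }
   
   func (h *bufferingHook) Fire(entry *logrus.Entry) error {
       line, err := entry.String() // the allocation the profile attributes to (*Entry).String
       if err != nil {
           return err
       }
       h.mu.Lock()
       h.pending = append(h.pending, line)
       h.mu.Unlock()
       return nil
   }
   
   // Drain returns and clears the buffered lines. The fix amounts to making
   // sure the buffer is always drained or bounded, even without a log topic.
   func (h *bufferingHook) Drain() []string {
       h.mu.Lock()
       defer h.mu.Unlock()
       out := h.pending
       h.pending = nil
       return out
   }
   
   func main() {
       logrus.SetOutput(io.Discard) // keep stderr quiet; the hook still fires
       hook := &bufferingHook{}
       logrus.AddHook(hook)
       for i := 0; i < 1000; i++ {
           logrus.Infof("This is test of pulsar function. #%d", i)
       }
       fmt.Println("buffered log lines:", len(hook.Drain()))
   }
   ```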
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Prepare a pulsar cluster in standalone mode (for example, started as shown below).
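   A standalone cluster can be started from the Pulsar distribution directory with:
   ```
   $ bin/pulsar standalone
   ```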
   2. Prepare a simple pulsar function worker. Refer to the code below.
   ```
   $ cat simple-worker.go 
   package main
   
   import (
           "context"
           "fmt"
           "net/http"
           _ "net/http/pprof"
   
           log "github.com/apache/pulsar/pulsar-function-go/logutil"
           "github.com/apache/pulsar/pulsar-function-go/pf"
   )
   
   func main() {
           // go routine for pprof
           go func() {
                   fmt.Println("%+v", http.ListenAndServe("localhost:6060", nil))
           }()
   
           pf.Start(testFunction)
   }
   
   func testFunction(ctx context.Context) {
           for i := 0; i < 10; i++ {
                   log.Infof("This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function. This is test of pulsar function.\n")
           }
   }
   ```
   3. Prepare a conf.yaml file for the function worker.
   ```
   ---
   pulsarServiceURL: "pulsar://sunkwan-devpc:6650"
   instanceID: 0
   funcID: "ed5f48f4-8e76-4d30-8b88-6eea1ff3cbf3"
   funcVersion: "12f361a0-67e5-4404-8177-8c328a33e3db"
   maxBufTuples: 1024
   port: 0
   clusterName: "standalone"
   killAfterIdleMs: 0
   tenant: "public"
   nameSpace: "default"
   name: "func-concat"
   className: ''
   #logTopic: 'persistent://public/default/func-concat-logs'
   logTopic: ''
   processingGuarantees: 0
   secretsMap: ''
   runtime: 0
   autoAck: true
   parallelism: 3
   subscriptionType: 0
   timeoutMs: 0
   subscriptionName: ''
   cleanupSubscription: true
   sourceSpecsTopic: "persistent://public/default/func-concat"
   sourceSchemaType: ''
   receiverQueueSize: 1
   #sinkSpecsTopic: "persistent://public/default/func-concat-output"
   #sinkSpecsTopic: ""
   #sinkSchemaType: ''
   #cpu: 1
   #ram: 1073741824
   #disk: 10737418240
   #maxMessageRetries: 0
   #deadLetterTopic: ''
   #regexPatternSubscription: false
   ```
   4. Run the simple pulsar function worker with the command line below.
   ```
   ./simple-worker --instance-conf-path ./conf.yaml
   ```
   5. Send messages to the input topic continuously (for example, with the `pulsar-client` CLI as shown below).
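   One way to do this, assuming the input topic from the conf.yaml above, is:
   ```
   $ bin/pulsar-client produce persistent://public/default/func-concat -m "This is a test message" -n 1000
   ```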
   6. Check the memory usage and heap status using pprof.
   ```
   $ go tool pprof -top http://localhost:6060/debug/pprof/heap
   Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
   Saved profile in /root/pprof/pprof.simple-worker.alloc_objects.alloc_space.inuse_objects.inuse_space.120.pb.gz
   File: simple-worker
   Build ID: bbb7d25540b7a482661b05e4c0da0a5ba2ef7bae
   Type: inuse_space
   Time: Nov 8, 2019 at 3:48pm (KST)
   Showing nodes accounting for 30080.51kB, 100% of 30080.51kB total
         flat  flat%   sum%        cum   cum%
   29211.22kB 97.11% 97.11% 29211.22kB 97.11%  github.com/sirupsen/logrus.(*Entry).String
     869.29kB  2.89%   100% 30080.51kB   100%  github.com/apache/pulsar/pulsar-function-go/logutil.(*contextHook).Fire
            0     0%   100%  1536.21kB  5.11%  github.com/apache/pulsar/pulsar-function-go/logutil.Error
            0     0%   100% 28544.30kB 94.89%  github.com/apache/pulsar/pulsar-function-go/logutil.Infof
            0     0%   100%  1536.21kB  5.11%  github.com/apache/pulsar/pulsar-function-go/pf.(*goInstance).addLogTopicHandler
            0     0%   100% 28544.30kB 94.89%  github.com/apache/pulsar/pulsar-function-go/pf.(*goInstance).handlerMsg
   ....
   
   ```
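   To confirm the growth between two snapshots, the profile saved by an earlier run can be used as a baseline with pprof's `-base` flag, for example:
   ```
   $ go tool pprof -top -base /root/pprof/pprof.simple-worker.alloc_objects.alloc_space.inuse_objects.inuse_space.120.pb.gz http://localhost:6060/debug/pprof/heap
   ```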
   
   **Expected behavior**
   - The `ProducerQueueIsFull` error should not occur.
   - Memory usage of the pulsar function worker process should not grow indefinitely.
   
   **Screenshots**
   N/A
   
   **Desktop (please complete the following information):**
    - OS: Ubuntu 18.04
   
   **Additional context**
   N/A
