Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2022/07/28 05:26:56 UTC

[GitHub] [skywalking] LL1024LL opened a new issue, #9397: [Bug] [Kong]report trace info not working

LL1024LL opened a new issue, #9397:
URL: https://github.com/apache/skywalking/issues/9397

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Apache SkyWalking Component
   
   Kong Agent (apache/skywalking-kong)
   
   ### What happened
   
   When I use skywalking-kong:0.2.0, reporting traces to the OAP server randomly stops working. I read the code, added some logging to find out why, and noticed that this code block may be the bug:
   
   [https://github.com/apache/skywalking-nginx-lua/blob/master/lib/skywalking/client.lua#L63](https://github.com/apache/skywalking-nginx-lua/blob/master/lib/skywalking/client.lua#L63)
   ```
       if 0 == ngx.worker.id() then
           local ok, err = new_timer(self.backendTimerDelay, check)
           if not ok then
               log(ERR, "failed to create timer: ", err)
               return
           end
       end
   ```
   
   I added some debug logging and found that when the initializing worker is worker-0, everything works fine.
   ![Snipaste_2022-07-28_13-10-08](https://user-images.githubusercontent.com/34938683/181425950-a44ce496-8c62-46e0-9f95-7fd7460ebb1c.png)
   
   But it is not always worker-0 that does the init work. For testing, I modified SEGMENT_BATCH_COUNT:
   [https://github.com/apache/skywalking-nginx-lua/blob/master/lib/skywalking/client.lua#L21](https://github.com/apache/skywalking-nginx-lua/blob/master/lib/skywalking/client.lua#L21)
   ```
   local SEGMENT_BATCH_COUNT = 5
   ```
   ![Snipaste_2022-07-28_13-14-40](https://user-images.githubusercontent.com/34938683/181426066-ecdda0af-a12e-44be-a202-ce6fafa0e7df.png)
   
   As the picture shows, the buffer size has grown larger than SEGMENT_BATCH_COUNT (5), but no trace report happens, and the trace info stays in the memory queue. I think this is because the report timer was never created, since the initializing worker's ID is 9, not 0.
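   
   The symptom can be illustrated with a small standalone model in plain Lua. This is only a sketch under stated assumptions: `new_reporter`, `push_segment`, `tick`, and `simulate` are hypothetical names that do not exist in skywalking-nginx-lua, the worker ID is passed in as a parameter instead of being read from `ngx.worker.id()`, and a "tick" simply drains the whole queue.
   
   ```lua
   -- Simplified, standalone model (plain Lua, no OpenResty APIs).
   local SEGMENT_BATCH_COUNT = 5
   
   -- Hypothetical stand-in for the reporting side of client.lua.
   local function new_reporter(worker_id)
       local reporter = { queue = {}, sent = 0, timer_created = false }
       -- Mirrors the gate quoted above: only worker 0 creates the timer.
       if worker_id == 0 then
           reporter.timer_created = true
       end
       return reporter
   end
   
   local function push_segment(reporter, segment)
       reporter.queue[#reporter.queue + 1] = segment
   end
   
   -- Stands in for one timer tick; it does nothing if no timer exists.
   local function tick(reporter)
       if not reporter.timer_created then
           return
       end
       while #reporter.queue > 0 do
           table.remove(reporter.queue, 1)
           reporter.sent = reporter.sent + 1
       end
   end
   
   -- Push `requests` segments, giving the timer a chance to run after each.
   local function simulate(worker_id, requests)
       local reporter = new_reporter(worker_id)
       for i = 1, requests do
           push_segment(reporter, "segment-" .. i)
           tick(reporter)
       end
       return #reporter.queue, reporter.sent
   end
   
   local queued, sent = simulate(9, SEGMENT_BATCH_COUNT + 3)
   print(queued, sent)  -- 8 segments stay queued, 0 are sent
   ```
   
   In this model, calling `simulate(0, 8)` instead leaves nothing queued, which matches the observation that everything works when worker-0 does the init work.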
   
   
   ### What you expected to happen
   
   The kong-skywalking plugin works fine even when the initializing worker's ID is not 0.
   
   ### How to reproduce
   
   1. Enable the kong-skywalking plugin.
   2. Send a batch of proxy requests through Kong with curl, more than SEGMENT_BATCH_COUNT.
   3. When the initializing worker's ID is not 0, no trace information is sent to the OAP server.
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng closed issue #9397: [Bug] [Kong]report trace info not working

Posted by GitBox <gi...@apache.org>.
wu-sheng closed issue #9397: [Bug] [Kong]report trace info not working
URL: https://github.com/apache/skywalking/issues/9397




[GitHub] [skywalking] LL1024LL commented on issue #9397: [Bug] [Kong]report trace info not working

Posted by GitBox <gi...@apache.org>.
LL1024LL commented on issue #9397:
URL: https://github.com/apache/skywalking/issues/9397#issuecomment-1197699534

   > First of all, you should know that all the code is included in and tested through e2e
   > 
   > 1. Nginx LUA proj, https://github.com/apache/skywalking-nginx-lua/blob/master/.github/workflows/e2e.yaml
   > 2. Kong proj, https://github.com/apache/skywalking-kong/blob/master/.github/workflows/e2e.yml
   > 
   > We really run the plugin in the service, and verified the segments it reported.
   > 
   > I am not the one who wrote all this code, but I think it would be better if you could find out what differs between your case and our e2e tests.
   
   Yes, I know. I read the test files earlier to look for information, and I noticed that the test config does some init work in nginx.conf, including calling the startBackendTimer function.
   [https://github.com/apache/skywalking-nginx-lua/blob/master/test/e2e/nginx/conf.d/nginx.conf#L44](https://github.com/apache/skywalking-nginx-lua/blob/master/test/e2e/nginx/conf.d/nginx.conf#L44)
   ```
    init_worker_by_lua_block {
           local metadata_buffer = ngx.shared.tracing_buffer
   
           metadata_buffer:set("serviceName", "skywalking-nginx")
           -- Instance means the number of Nginx deployment, does not mean the worker instances
           metadata_buffer:set("serviceInstanceName", "e2e")
            -- set ignoreSuffix
           require("skywalking.util").set_ignore_suffix(".jpg,.jpeg,.js,.css,.png,.bmp,.gif,.ico,.mp3,.mp4,.svg")
   
           require("skywalking.util").set_randomseed()
           require("skywalking.client"):startBackendTimer("http://172.16.238.10:12800")
   
           -- If there is a bug of this `tablepool` implementation, we can
           -- disable it in this way
           -- require("skywalking.util").disable_tablepool()
   
           skywalking_tracer = require("skywalking.tracer")
       }
   ```
   I think this is why the test case passes. It is just my guess.
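   
   Since `init_worker_by_lua_block` runs once in every worker process at startup, the guess can be modelled in plain Lua as below. This is a rough sketch, not the project's code: `start_backend_timer` and `init_all_workers` are hypothetical helpers, and the worker ID is passed in explicitly rather than read from `ngx.worker.id()`.
   
   ```lua
   -- Hypothetical model: every worker runs the init block once at startup.
   local function start_backend_timer(worker_id, timers)
       -- Same gate as in client.lua: only worker 0 creates the timer.
       if worker_id == 0 then
           timers[#timers + 1] = worker_id
       end
   end
   
   local function init_all_workers(worker_count)
       local timers = {}
       for worker_id = 0, worker_count - 1 do
           start_backend_timer(worker_id, timers)
       end
       return timers
   end
   
   local timers = init_all_workers(10)
   print(#timers)  -- 1: exactly one timer, owned by worker 0
   ```
   
   Under this model the timer always exists, because worker 0 is guaranteed to run the init block, whatever worker later serves the first request.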




[GitHub] [skywalking] wu-sheng commented on issue #9397: [Bug] [Kong]report trace info not working

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9397:
URL: https://github.com/apache/skywalking/issues/9397#issuecomment-1197685445

   This does not use ID=0 to initialize; we expect the ID=0 worker to report the trace. We have to pick one worker; otherwise, the trace would be reported repeatedly or hit race conditions.
   
   My question would be: why don't you have an ID=0 worker?




[GitHub] [skywalking] wu-sheng commented on issue #9397: [Bug] [Kong]report trace info not working

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9397:
URL: https://github.com/apache/skywalking/issues/9397#issuecomment-1197696295

   I don't think the whole code flow has a general issue; otherwise, people would have reported this a long time ago.
   I think some settings, special use cases, versions, or something else make things different.




[GitHub] [skywalking] wu-sheng commented on issue #9397: [Bug] [Kong]report trace info not working

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9397:
URL: https://github.com/apache/skywalking/issues/9397#issuecomment-1197709630

   Yes, it is needed as documented.




[GitHub] [skywalking] wu-sheng commented on issue #9397: [Bug] [Kong]report trace info not working

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9397:
URL: https://github.com/apache/skywalking/issues/9397#issuecomment-1197695657

   First of all, you should know that all the code is included in and tested through e2e
   1. Nginx LUA proj, https://github.com/apache/skywalking-nginx-lua/blob/master/.github/workflows/e2e.yaml
   2. Kong proj, https://github.com/apache/skywalking-kong/blob/master/.github/workflows/e2e.yml
   
   We really run the plugin in the service, and verified the segments it reported. 
   
   I am not the one who wrote all this code, but I think it would be better if you could find out what differs between your case and our e2e tests.




[GitHub] [skywalking] LL1024LL commented on issue #9397: [Bug] [Kong]report trace info not working

Posted by GitBox <gi...@apache.org>.
LL1024LL commented on issue #9397:
URL: https://github.com/apache/skywalking/issues/9397#issuecomment-1197691281

   
   
   
   
   > This does not use ID=0 to initialize; we expect the ID=0 worker to report the trace. We have to pick one worker; otherwise, the trace would be reported repeatedly or hit race conditions.
   > 
   > My question would be: why don't you have an ID=0 worker?
   
   
   Thank you for your reply!
   I read the report process flow, and I think the init work happens here.
   [https://github.com/apache/skywalking-kong/blob/master/kong/plugins/skywalking/handler.lua#L44](https://github.com/apache/skywalking-kong/blob/master/kong/plugins/skywalking/handler.lua#L44)
   ```
       if config.sample_ratio == 100 or math.random() * 100 < config.sample_ratio then
           kong.ctx.plugin.skywalking_sample = true
   
           if not client:isInitialized() then
               local metadata_buffer = ngx.shared.tracing_buffer
               metadata_buffer:set('serviceName', config.service_name)
               metadata_buffer:set('serviceInstanceName', config.service_instance_name)
               metadata_buffer:set('includeHostInEntrySpan', config.include_host_in_entry_span)
   
               client:startBackendTimer(config.backend_http_uri)
           end
   
           tracer:start(self:get_remote_peer(ngx.ctx.balancer_data))
       end
   ```
   
   When the first worker (I think it is chosen randomly) reaches here, it does the initial work, including calling client:startBackendTimer.
   But in the client:startBackendTimer function
   [https://github.com/apache/skywalking-nginx-lua/blob/master/lib/skywalking/client.lua#L33](https://github.com/apache/skywalking-nginx-lua/blob/master/lib/skywalking/client.lua#L33)
   ```
   function Client:startBackendTimer(backend_http_uri)
       initialized = true
   ```
   The first thing the function does is set initialized = true.
   So other workers will not enter the Client:startBackendTimer function, including worker-id = 0.
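   
   The sequence described here can be sketched in plain Lua. This is a simplified model, not the real modules: `new_client` and `handle_request` are hypothetical, `startBackendTimer` takes a worker ID here instead of the backend URI, and the sketch assumes, as this comment does, that a single `initialized` flag is seen by every request. It does not establish whether that flag is actually shared between nginx workers.
   
   ```lua
   -- Hypothetical model of the guard in handler.lua plus the gate in client.lua.
   local function new_client()
       local client = { initialized = false, timers = 0 }
   
       function client:isInitialized()
           return self.initialized
       end
   
       function client:startBackendTimer(worker_id)
           self.initialized = true  -- set before the worker check
           if worker_id == 0 then
               self.timers = self.timers + 1
           end
       end
   
       return client
   end
   
   -- Stands in for the access-phase guard: initialize on the first request only.
   local function handle_request(client, worker_id)
       if not client:isInitialized() then
           client:startBackendTimer(worker_id)
       end
   end
   
   local client = new_client()
   handle_request(client, 9)  -- first request lands on worker 9
   handle_request(client, 0)  -- later request on worker 0 is skipped by the guard
   print(client:isInitialized(), client.timers)  -- true  0
   ```
   
   The client ends up marked as initialized while no timer exists, which is the state suspected in this report.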
   

