Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2022/07/28 05:26:56 UTC
[GitHub] [skywalking] LL1024LL opened a new issue, #9397: [Bug] [Kong]report trace info not working
LL1024LL opened a new issue, #9397:
URL: https://github.com/apache/skywalking/issues/9397
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar issues.
### Apache SkyWalking Component
Kong Agent (apache/skywalking-kong)
### What happened
When using skywalking-kong:0.2.0, I found that trace reporting to the OAP randomly stops working. I read the code and added some logging to find out why, and I noticed that this code block may be a bug:
[https://github.com/apache/skywalking-nginx-lua/blob/master/lib/skywalking/client.lua#L63](https://github.com/apache/skywalking-nginx-lua/blob/master/lib/skywalking/client.lua#L63)
```
if 0 == ngx.worker.id() then
    local ok, err = new_timer(self.backendTimerDelay, check)
    if not ok then
        log(ERR, "failed to create timer: ", err)
        return
    end
end
```
I added some debug logging and found that when the initializing worker is worker-0, everything works fine.
![Snipaste_2022-07-28_13-10-08](https://user-images.githubusercontent.com/34938683/181425950-a44ce496-8c62-46e0-9f95-7fd7460ebb1c.png)
But it is not always worker-0 that does the init work. For testing, I modified SEGMENT_BATCH_COUNT:
[https://github.com/apache/skywalking-nginx-lua/blob/master/lib/skywalking/client.lua#L21](https://github.com/apache/skywalking-nginx-lua/blob/master/lib/skywalking/client.lua#L21)
```
local SEGMENT_BATCH_COUNT = 5
```
![Snipaste_2022-07-28_13-14-40](https://user-images.githubusercontent.com/34938683/181426066-ecdda0af-a12e-44be-a202-ce6fafa0e7df.png)
As the picture shows, the buffer size has grown larger than SEGMENT_BATCH_COUNT (5), but no trace report happens and the trace info stays in the memory queue. I think this is because the report timer was never created, since the initializing worker's id is 9, not 0.
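The suspected failure mode can be sketched in plain Lua, outside of OpenResty. This is a minimal simulation, not the library's code: `start_backend_timer` and `timers` are hypothetical names, and the worker id is passed in as an argument instead of being read from `ngx.worker.id()`. It only mirrors the `0 == ngx.worker.id()` guard quoted above.

```lua
-- Hypothetical stand-in for the guarded timer creation quoted above.
-- `timers` records which simulated worker ids ended up owning a timer.
local timers = {}

local function start_backend_timer(worker_id)
    -- Mirrors the guard `if 0 == ngx.worker.id() then`:
    -- only worker 0 creates the report timer.
    if worker_id == 0 then
        timers[#timers + 1] = worker_id
        return true
    end
    return false
end

-- Initialization is triggered on worker 9, as in the case described above.
local created = start_backend_timer(9)
print(created, #timers)
```

In this simulation no timer exists after the call, which would match segments piling up in the queue without being reported.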
### What you expected to happen
The kong-skywalking plugin works fine even when the initializing worker's id is not 0.
### How to reproduce
1. Enable the kong-skywalking plugin.
2. Send a batch of proxy requests through Kong with curl, more than SEGMENT_BATCH_COUNT.
3. When the initializing worker's id is not 0, no trace information is sent to the OAP server.
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [skywalking] wu-sheng closed issue #9397: [Bug] [Kong]report trace info not working
Posted by GitBox <gi...@apache.org>.
wu-sheng closed issue #9397: [Bug] [Kong]report trace info not working
URL: https://github.com/apache/skywalking/issues/9397
[GitHub] [skywalking] LL1024LL commented on issue #9397: [Bug] [Kong]report trace info not working
Posted by GitBox <gi...@apache.org>.
LL1024LL commented on issue #9397:
URL: https://github.com/apache/skywalking/issues/9397#issuecomment-1197699534
> First of all, you should know that all the code has been included and tested through e2e
>
> 1. Nginx LUA proj, https://github.com/apache/skywalking-nginx-lua/blob/master/.github/workflows/e2e.yaml
> 2. Kong proj, https://github.com/apache/skywalking-kong/blob/master/.github/workflows/e2e.yml
>
> We actually run the plugin in the service and verify the segments it reports.
>
> I am not the one who wrote all this code, but I think it would be better if you could find out what the difference is between your case and our e2e tests.
Yes, I know. I read the test files earlier to look for information. I noticed that the test conf does some init work in nginx.conf, including calling the startBackendTimer function.
[https://github.com/apache/skywalking-nginx-lua/blob/master/test/e2e/nginx/conf.d/nginx.conf#L44](https://github.com/apache/skywalking-nginx-lua/blob/master/test/e2e/nginx/conf.d/nginx.conf#L44)
```
init_worker_by_lua_block {
    local metadata_buffer = ngx.shared.tracing_buffer
    metadata_buffer:set("serviceName", "skywalking-nginx")
    -- Instance means the number of Nginx deployment, does not mean the worker instances
    metadata_buffer:set("serviceInstanceName", "e2e")
    -- set ignoreSuffix
    require("skywalking.util").set_ignore_suffix(".jpg,.jpeg,.js,.css,.png,.bmp,.gif,.ico,.mp3,.mp4,.svg")
    require("skywalking.util").set_randomseed()
    require("skywalking.client"):startBackendTimer("http://172.16.238.10:12800")
    -- If there is a bug of this `tablepool` implementation, we can
    -- disable it in this way
    -- require("skywalking.util").disable_tablepool()
    skywalking_tracer = require("skywalking.tracer")
}
```
I think this is why the test case passes? Just my guess.
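A plain-Lua sketch of why the e2e configuration would not hit the problem, assuming that `init_worker_by_lua_block` runs once in every worker process when it starts. `WORKER_COUNT`, `start_backend_timer`, and `timer_owners` are hypothetical names used only for this simulation; the guard is the one from the `client.lua` snippet in the issue description.

```lua
-- Assumed number of workers for the simulation (ids 0..9).
local WORKER_COUNT = 10

-- Worker ids that ended up creating a report timer.
local timer_owners = {}

local function start_backend_timer(worker_id)
    -- Same guard as in the quoted client code: only worker 0 creates the timer.
    if worker_id == 0 then
        timer_owners[#timer_owners + 1] = worker_id
    end
end

-- init_worker phase: every worker calls the start function exactly once,
-- so worker 0 is always among the callers.
for worker_id = 0, WORKER_COUNT - 1 do
    start_backend_timer(worker_id)
end

print(#timer_owners, timer_owners[1])
```

Because worker 0 is guaranteed to make the call in this setup, exactly one timer is created regardless of which worker serves the first request.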
[GitHub] [skywalking] wu-sheng commented on issue #9397: [Bug] [Kong]report trace info not working
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9397:
URL: https://github.com/apache/skywalking/issues/9397#issuecomment-1197685445
This does not use ID=0 to initialize; we expect ID=0 to report the trace. We have to pick one worker, otherwise the trace would be reported repeatedly or hit race conditions.
My question would be, why don't you have an ID=0 worker?
[GitHub] [skywalking] wu-sheng commented on issue #9397: [Bug] [Kong]report trace info not working
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9397:
URL: https://github.com/apache/skywalking/issues/9397#issuecomment-1197696295
I don't think the whole code flow has a general issue, otherwise people would have reported this a long time ago.
I think some settings, special use cases, or versions make things different.
[GitHub] [skywalking] wu-sheng commented on issue #9397: [Bug] [Kong]report trace info not working
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9397:
URL: https://github.com/apache/skywalking/issues/9397#issuecomment-1197709630
Yes, it is needed as documented.
[GitHub] [skywalking] wu-sheng commented on issue #9397: [Bug] [Kong]report trace info not working
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9397:
URL: https://github.com/apache/skywalking/issues/9397#issuecomment-1197695657
First of all, you should know that all the code has been included and tested through e2e
1. Nginx LUA proj, https://github.com/apache/skywalking-nginx-lua/blob/master/.github/workflows/e2e.yaml
2. Kong proj, https://github.com/apache/skywalking-kong/blob/master/.github/workflows/e2e.yml
We actually run the plugin in the service and verify the segments it reports.
I am not the one who wrote all this code, but I think it would be better if you could find out what the difference is between your case and our e2e tests.
[GitHub] [skywalking] LL1024LL commented on issue #9397: [Bug] [Kong]report trace info not working
Posted by GitBox <gi...@apache.org>.
LL1024LL commented on issue #9397:
URL: https://github.com/apache/skywalking/issues/9397#issuecomment-1197691281
> This does not use ID=0 to initialize; we expect ID=0 to report the trace. We have to pick one worker, otherwise the trace would be reported repeatedly or hit race conditions.
>
> My question would be, why don't you have an ID=0 worker?
Thank you for your reply!
I read through the report process flow, and I think the init work happens here:
[https://github.com/apache/skywalking-kong/blob/master/kong/plugins/skywalking/handler.lua#L44](https://github.com/apache/skywalking-kong/blob/master/kong/plugins/skywalking/handler.lua#L44)
```
if config.sample_ratio == 100 or math.random() * 100 < config.sample_ratio then
    kong.ctx.plugin.skywalking_sample = true
    if not client:isInitialized() then
        local metadata_buffer = ngx.shared.tracing_buffer
        metadata_buffer:set('serviceName', config.service_name)
        metadata_buffer:set('serviceInstanceName', config.service_instance_name)
        metadata_buffer:set('includeHostInEntrySpan', config.include_host_in_entry_span)
        client:startBackendTimer(config.backend_http_uri)
    end
    tracer:start(self:get_remote_peer(ngx.ctx.balancer_data))
end
```
When the first worker (I think it is chosen randomly) reaches this point, it does the initial work, including calling client:startBackendTimer.
But in the client:startBackendTimer function:
[https://github.com/apache/skywalking-nginx-lua/blob/master/lib/skywalking/client.lua#L33](https://github.com/apache/skywalking-nginx-lua/blob/master/lib/skywalking/client.lua#L33)
```
function Client:startBackendTimer(backend_http_uri)
    initialized = true
```
The first thing the function does is set initialized = true.
So the other workers will not enter the Client:startBackendTimer function, including the worker with id 0.
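The flow described in this comment can be modelled in plain Lua. This is only a sketch of the comment's reasoning, not the plugin's code: `on_request`, `start_backend_timer`, and `timer_created` are hypothetical names, and the model takes the comment's premise that once any worker sets `initialized`, every other worker sees it as set. That premise is an assumption and is not verified here.

```lua
-- ASSUMPTION (the comment's premise, not verified): `initialized` is observed
-- by every worker once any worker has set it.
local initialized = false
local timer_created = false

local function start_backend_timer(worker_id)
    initialized = true          -- set before the worker-id check, as quoted
    if worker_id == 0 then      -- only worker 0 would create the report timer
        timer_created = true
    end
end

-- Hypothetical stand-in for the lazy check in the handler's request path.
local function on_request(worker_id)
    if not initialized then
        start_backend_timer(worker_id)
    end
end

on_request(9)   -- the first sampled request lands on worker 9
on_request(0)   -- worker 0 serves a request later but skips initialization

print(initialized, timer_created)
```

Under that premise the model ends up initialized but without a timer. In OpenResty each worker process normally has its own Lua VM, so whether a module-level `initialized` is really visible across workers would need to be checked against the actual module.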