Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2021/04/30 03:06:44 UTC

[GitHub] [skywalking] Gallop-stark opened a new issue #6876: mesh url level accesslog metrics

Gallop-stark opened a new issue #6876:
URL: https://github.com/apache/skywalking/issues/6876


   Please answer these questions before submitting your issue.
   
   - Why do you submit this issue?
   - [ ] Question or discussion
   - [ ] Bug
   - [ ] Requirement
   - [x] Feature or performance improvement
   ___
   ### Requirement or improvement
   - Please describe your requirements or improvement suggestions.
   
   - About URL-level access log metrics in the mesh: if URLs carry parameters, the number of distinct raw URLs in the access log will be huge. If we can recognize the regex rule behind the raw URLs, we can push that rule to the mesh sidecar filter. Here is one way to infer the likely regex rules from raw URLs.
   
   For example,
   
   Raw urls in accesslog:
   /user/1,/user/2,/user/3,/user/4,/index
   
   From these we can count, splitting each URL by '/':
   1. "/user" in the first position appears in 4 URLs;
   2. "/1" in the second position appears in 1 URL;
   3. "/2" in the second position appears in 1 URL;
   4. "/3" in the second position appears in 1 URL;
   5. "/4" in the second position appears in 1 URL;
   6. "/index" in the first position appears in 1 URL.
   
   "/user" in the first position appears in 4 URLs, more than any other segment, so we treat it as the likely immutable part of a regex rule; the other segments are treated as variable.
   
   Next, for each URL:
   1. /user/1: /user is the immutable part and is kept unchanged; /1 is variable and is replaced with /[^/]+, so ^/user/[^/]+$ has matched count = 1;
   2. /user/2: same replacement, so ^/user/[^/]+$ has matched count = 2;
   3. /user/3: same replacement, so ^/user/[^/]+$ has matched count = 3;
   4. /user/4: same replacement, so ^/user/[^/]+$ has matched count = 4;
   5. /index: /index is not an immutable part, so it is replaced with /[^/]+, giving ^/[^/]+$ with matched count = 1.
   
   Finally:
   we select the pattern with the highest matched count, ^/user/[^/]+$, and push it to the mesh sidecar filter.
   
   Then we repeat the steps above.
   
   Like most learning algorithms, it is not guaranteed to be 100% correct and needs more data to become more accurate. In our company's production environment it covers about 95% of cases; the remaining 5% need users to set and adjust rules manually. I don't know whether this helps with SkyWalking mesh URL metrics.
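
The counting and replacement steps described above can be sketched roughly as follows. This is a minimal illustration, not a tested production implementation: the function name `infer_patterns` and the `min_count` threshold (how many URLs a segment must appear in before it is treated as immutable) are assumptions, since the description only says that "/user" appears in more URLs than the other segments.

```python
import re
from collections import Counter

WILDCARD = "[^/]+"  # replacement for a variable path segment


def split_segments(url):
    """Split "/user/1" into ["user", "1"], ignoring empty segments."""
    return [seg for seg in url.split("/") if seg]


def infer_patterns(urls, min_count=2):
    """Return a Counter mapping each inferred regex to its matched count.

    min_count is a hypothetical threshold: a segment at a given position
    is treated as immutable only if it appears in at least that many URLs.
    """
    seg_lists = [split_segments(u) for u in urls]

    # Step 1: count how many URLs contain each (position, segment) pair.
    position_counts = Counter()
    for segs in seg_lists:
        for pos, seg in enumerate(segs):
            position_counts[(pos, seg)] += 1

    # Step 2: keep frequent segments, replace the rest with the wildcard,
    # and count how many raw URLs collapse into each resulting pattern.
    pattern_counts = Counter()
    for segs in seg_lists:
        parts = [
            re.escape(seg) if position_counts[(pos, seg)] >= min_count else WILDCARD
            for pos, seg in enumerate(segs)
        ]
        pattern_counts["^/" + "/".join(parts) + "$"] += 1
    return pattern_counts


if __name__ == "__main__":
    raw = ["/user/1", "/user/2", "/user/3", "/user/4", "/index"]
    for pattern, count in infer_patterns(raw).most_common():
        print(pattern, count)
```

On the five example URLs this gives `^/user/[^/]+$` a matched count of 4 and `^/[^/]+$` a matched count of 1, so `most_common(1)` returns the pattern that would be pushed to the sidecar filter.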
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [skywalking] Gallop-stark removed a comment on issue #6876: mesh url level accesslog metrics

Posted by GitBox <gi...@apache.org>.
Gallop-stark removed a comment on issue #6876:
URL: https://github.com/apache/skywalking/issues/6876#issuecomment-830476994







[GitHub] [skywalking] wu-sheng closed issue #6876: mesh url level accesslog metrics

Posted by GitBox <gi...@apache.org>.
wu-sheng closed issue #6876:
URL: https://github.com/apache/skywalking/issues/6876


   





[GitHub] [skywalking] wu-sheng commented on issue #6876: mesh url level accesslog metrics

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #6876:
URL: https://github.com/apache/skywalking/issues/6876#issuecomment-830482846


   Sorry, I don't remember the name of the algorithm, but I have tried one before.
   This is actually a very typical mathematical problem; once you have a data set, it should not be very hard.
   But this kind of data processing is a periodic task, for example one driven by a timer.





[GitHub] [skywalking] Gallop-stark commented on issue #6876: mesh url level accesslog metrics

Posted by GitBox <gi...@apache.org>.
Gallop-stark commented on issue #6876:
URL: https://github.com/apache/skywalking/issues/6876#issuecomment-830478172


   Thank you for your reply!
   
   > From the unified metrics, the endpoint grouping mechanism is existing for a while in SkyWalking. You could do this today.
   
   I found this: [docs/en/setup/backend/endpoint-grouping-rules.md](https://github.com/apache/skywalking/blob/83757dae617f3ff5b0a61fc8985b859f28979974/docs/en/setup/backend/endpoint-grouping-rules.md). I am not sure whether it is the right document.
   
   Maybe my environment is different from SkyWalking's.
   
   From this document, I understand that the endpoint grouping mechanism requires users to add regex patterns manually. In my environment, many services have many regex patterns, and users would find it troublesome to add so many of them; this is why I designed this learning algorithm. I know the real URL is better, but in my environment there are too many real URLs and too many regex patterns, and all requests must be detected.
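
To make the manual burden concrete, regex-based grouping can be illustrated roughly as below. This is only a sketch of the general idea, not SkyWalking's implementation or its configuration format: the `RULES` list, the grouped names such as `/user/{id}`, and the function `group_endpoint` are all hypothetical. Every distinct URL shape of every service needs one such hand-written rule.

```python
import re

# Hypothetical hand-written rules: (compiled regex, grouped endpoint name).
RULES = [
    (re.compile(r"^/user/[^/]+$"), "/user/{id}"),
    (re.compile(r"^/order/[^/]+/items$"), "/order/{id}/items"),
]


def group_endpoint(raw, rules=RULES):
    """Map a raw URL to the name of the first matching rule.

    URLs that match no rule are returned unchanged, so they stay
    ungrouped until someone writes a rule for them.
    """
    for pattern, name in rules:
        if pattern.match(raw):
            return name
    return raw


if __name__ == "__main__":
    for url in ["/user/1", "/user/2", "/order/7/items", "/index"]:
        print(url, "->", group_endpoint(url))
```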
   
   Yes, a language agent can get the real regex patterns from the code, but I think that is not suitable for mesh gateway detection, for these reasons:
   1. My environment includes Java, PHP, Node.js, Python, Go, C++, and more. Building an agent for every language takes a lot of time, and every newly added language needs a new agent.
   2. I think the mesh gateway should detect all incoming requests. The regex patterns a language agent obtains from the code describe only the requests the code expects; incoming traffic may include URLs the code does not expect, and the mesh should detect those too. So mesh gateway detection differs from language agent detection.
   
   I would appreciate it if you could give me some suggestions.
   
   Thanks.
   
   





[GitHub] [skywalking] wu-sheng commented on issue #6876: mesh url level accesslog metrics

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #6876:
URL: https://github.com/apache/skywalking/issues/6876#issuecomment-829764845


   If you want an algorithm to find the pattern, such algorithms have existed for a long time.
   From the unified metrics, the endpoint grouping mechanism has existed in SkyWalking for a while. You could do this today. From an observer's perspective, we care about the real URI and do the grouping after we have the access log.
   So, could you be clearer about how this would help SkyWalking mesh URI metrics?





[GitHub] [skywalking] Gallop-stark commented on issue #6876: mesh url level accesslog metrics

Posted by GitBox <gi...@apache.org>.
Gallop-stark commented on issue #6876:
URL: https://github.com/apache/skywalking/issues/6876#issuecomment-830476994







[GitHub] [skywalking] Gallop-stark commented on issue #6876: mesh url level accesslog metrics

Posted by GitBox <gi...@apache.org>.
Gallop-stark commented on issue #6876:
URL: https://github.com/apache/skywalking/issues/6876#issuecomment-830490247


   Thanks, I will try searching for this algorithm :)

