Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2021/04/26 06:59:20 UTC

[GitHub] [skywalking] honganan opened a new issue #6836: Questions about Agent Sampling logic

honganan opened a new issue #6836:
URL: https://github.com/apache/skywalking/issues/6836


   While reading the code of `SamplingService.java` today, I came up with some questions about the agent sampling logic:
   
   1. I don't see what the `forceSampled()` method achieves. It only increments `samplingFactorHolder`, and the slot it takes up cannot be used by the current request. Its only effect seems to be that one fewer request can be sampled in the current period. In a high-concurrency scenario the current request still has to compete for sampling, so it does not seem to help that request get sampled.
   2. The `trySampling` method uses CAS. In an extreme situation, could some requests be ignored even though the total number of sampled requests in that period is still below the threshold set by `SAMPLE_N_PER_3_SECS`?
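
The two questions above can be made concrete with a small model. The following is a minimal sketch, assuming a counter-based sampler with a single, non-retried CAS attempt; it is not the actual `SamplingService` source. The class name `SamplingSketch` and the helpers `resetPeriod`, `used`, and `casAfterInterleavedUpdate` are hypothetical and exist only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical, simplified model of a counter-based sampler; not the SkyWalking source. */
class SamplingSketch {
    private final int sampleNPer3Secs; // sampling budget per period
    private final AtomicInteger samplingFactorHolder = new AtomicInteger(0);

    SamplingSketch(int sampleNPer3Secs) {
        this.sampleNPer3Secs = sampleNPer3Secs;
    }

    /** One CAS attempt without retry: a lost race means "not sampled", even if budget remains. */
    boolean trySampling() {
        int factor = samplingFactorHolder.get();
        if (factor < sampleNPer3Secs) {
            return samplingFactorHolder.compareAndSet(factor, factor + 1);
        }
        return false;
    }

    /** Only consumes one slot of the budget; the caller decides to sample on its own. */
    void forceSampled() {
        samplingFactorHolder.incrementAndGet();
    }

    /** Assumed to be invoked by a timer at the start of every period. */
    void resetPeriod() {
        samplingFactorHolder.set(0);
    }

    int used() {
        return samplingFactorHolder.get();
    }

    /** Replays a lost CAS race deterministically on a single thread. */
    static boolean casAfterInterleavedUpdate(AtomicInteger holder, int limit) {
        int factor = holder.get();  // "thread A" reads the counter
        holder.incrementAndGet();   // "thread B" updates it in between
        // "thread A" now attempts its CAS with a stale expected value
        return factor < limit && holder.compareAndSet(factor, factor + 1);
    }

    public static void main(String[] args) {
        SamplingSketch sampler = new SamplingSketch(3);
        sampler.forceSampled();                    // a propagated trace takes one slot
        System.out.println(sampler.trySampling()); // true
        System.out.println(sampler.trySampling()); // true
        System.out.println(sampler.trySampling()); // false: budget of 3 is used up

        AtomicInteger holder = new AtomicInteger(0);
        System.out.println(casAfterInterleavedUpdate(holder, 3)); // false
        System.out.println(holder.get());                         // 1, although the limit is 3
    }
}
```

Under these assumptions, `forceSampled()` only reduces the budget left for other requests in the period, and a request that loses the CAS race is rejected even though the counter is still below the limit.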


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [skywalking] wu-sheng commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

wu-sheng commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-826686343


   Back to the code: `#forceSampled` makes sure the trace continues when a context already exists and has been propagated from downstream. Also note that we never declared the N in `SAMPLE_N_PER_3_SECS` to be an exact number that must be kept. 



[GitHub] [skywalking] LiWenGu commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

LiWenGu commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-848765527


   > Let me give you the conclusion first: the feedback chain would work. Think about how MQ and many async mechanisms work in distributed systems today. Telling users that we could sample only the requests that went wrong would badly mislead them.
   
   I don't quite understand how the feedback chain would be implemented here.
   But when I tried to implement 100% error trace collection in SkyWalking, I ran into a problem:
   Take the call chain A -> B -> C. Until C returns, A and B do not know whether the trace contains an error, so they have to keep the TracingContext information in memory the whole time, which puts pressure on the machine's memory. 
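
The memory concern described above can be illustrated with a small model. This is a hypothetical sketch, not code from SkyWalking: the class `PendingTraceBuffer` and its methods are invented here, and spans are represented as plain strings. It assumes an upstream service that holds every span until the downstream outcome is known and reports only failed traces.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; SkyWalking does not ship such a buffer. */
class PendingTraceBuffer {
    // Every in-flight trace stays here until its outcome is known.
    private final Map<String, List<String>> pending = new HashMap<>();

    /** Holds a span until the outcome of the whole trace is known. */
    void hold(String traceId, String span) {
        pending.computeIfAbsent(traceId, id -> new ArrayList<>()).add(span);
    }

    /** Downstream answered: return the spans to report only if the trace failed. */
    List<String> complete(String traceId, boolean error) {
        List<String> spans = pending.remove(traceId);
        if (spans == null || !error) {
            return new ArrayList<>();
        }
        return spans;
    }

    /** Number of traces currently held in memory. */
    int inFlight() {
        return pending.size();
    }

    public static void main(String[] args) {
        PendingTraceBuffer serviceA = new PendingTraceBuffer();
        // 1000 requests are waiting for C; none of them can be dropped yet.
        for (int i = 0; i < 1000; i++) {
            serviceA.hold("trace-" + i, "A:/order");
        }
        System.out.println(serviceA.inFlight());                  // 1000
        System.out.println(serviceA.complete("trace-7", true));   // [A:/order]
        System.out.println(serviceA.complete("trace-8", false));  // []
        System.out.println(serviceA.inFlight());                  // 998
    }
}
```

In this model the buffer grows with the number of in-flight requests rather than with the sampling rate, and with asynchronous downstreams `complete` might never be called with a reliable outcome.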



[GitHub] [skywalking] wu-sheng closed issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

wu-sheng closed issue #6836:
URL: https://github.com/apache/skywalking/issues/6836


   



[GitHub] [skywalking] wu-sheng commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

wu-sheng commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-848778255


   > But when I tried to implement 100% error trace collection in SkyWalking, I ran into a problem:
   > Take the call chain A -> B -> C. Until C returns, A and B do not know whether the trace contains an error, so they have to keep the TracingContext information in memory the whole time, which puts pressure on the machine's memory.
   
   We never said this should be done. Did you misunderstand something?



[GitHub] [skywalking] wu-sheng commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

wu-sheng commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-826684076


   Also, you are still looking at SkyWalking as it was 4 years ago. Today, traces feed the topology and metrics analysis. Client-side sampling is recommended only when it is unavoidable because of the performance impact. Otherwise, don't sample on the client.
   Right now, in 2020 and 2021, k8s, microservices, and distributed services are the trend and the reality. Using tracing purely as enhanced logging is not recommended. 



[GitHub] [skywalking] wu-sheng commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

wu-sheng commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-848798879


   > I'm just discussing how to implement this feature in skywalking if possible
   
   From my understanding, if you consider MQ and async processes, this never works. So, this is a limited feature.



[GitHub] [skywalking] wu-sheng commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

wu-sheng commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-826682300


   Let me give you the conclusion first: the feedback chain would work. Think about how MQ and many async mechanisms work in distributed systems today. Telling users that we could sample only the requests that went wrong would badly mislead them.



[GitHub] [skywalking] LiWenGu commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

LiWenGu commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-848742576


   We would also like to have this feature, not to reduce storage load but for two purposes:
   1. Set a lower sampling rate to reduce the impact on client machine performance
   2. Make sure no abnormal trace is missed when troubleshooting online problems
   
   I see that this issue was added to the 8.6.0 milestone but has since been closed. 



[GitHub] [skywalking] LiWenGu commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

LiWenGu commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-848905889


   I understand now: if this feature were supported, these costs would be inevitable. Thanks.



[GitHub] [skywalking] LiWenGu commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

LiWenGu commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-848826571


   Yeah! When the downstream is an asynchronous framework, the upstream service cannot know whether the downstream business logic really succeeded. 
   I don’t know how to handle this scenario for the time being.
   But even when the whole trace is called synchronously, the memory problem I mentioned still exists.
   In that case, error traces would be 100% recorded (synchronous calls only).



[GitHub] [skywalking] wu-sheng commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

wu-sheng commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-826688529


   In the end, I think you are on the wrong path. If you want to reduce the storage load, try activating sampling at the backend; metrics, however, should stay accurate and consistent.
   If you don't need topology and metrics, you don't actually need an APM. Recording traces is not SkyWalking's design goal; APM is. 



[GitHub] [skywalking] LiWenGu commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

LiWenGu commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-848797511


   I'm just discussing how to implement this feature in skywalking if possible



[GitHub] [skywalking] wu-sheng commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

wu-sheng commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-848747354


   > We would also like to have this feature, not to reduce storage load but for two purposes:
   > 
   > 1. Set a lower sampling rate to reduce the impact on client machine performance
   > 2. Make sure no abnormal trace is missed when troubleshooting online problems
   > 
   > I see that this issue was added to the 8.6.0 milestone but has since been closed.
   
   We have explained the reasons. Unless you have something that addresses our concerns, there is no further action to be taken.



[GitHub] [skywalking] wu-sheng commented on issue #6836: Questions about Agent Sampling logic

Posted by GitBox <gi...@apache.org>.

wu-sheng commented on issue #6836:
URL: https://github.com/apache/skywalking/issues/6836#issuecomment-848859776


   > But even when the whole trace is called synchronously, the memory problem I mentioned still exists.
   
   You would definitely pay as much as 100% sampling costs, because you never know in advance which case you are in. That is also why we don't believe this could work efficiently. So, at the design level, **SkyWalking relies on making sure that 100% sampling works on the agent side**.

