Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2020/12/12 07:23:06 UTC

[GitHub] [skywalking] Frefreak opened a new issue #5997: agent side percentage sampling config option

Frefreak opened a new issue #5997:
URL: https://github.com/apache/skywalking/issues/5997


   Please answer these questions before submitting your issue.
   
   - Why do you submit this issue?
   - [X] Question or discussion
   - [ ] Bug
   - [ ] Requirement
   - [ ] Feature or performance improvement
   
   ___
   ### Question
   - What do you want to know?
   For sampling, there currently (v8.2.0/v8.3.0) seem to be two main ways to control trace sampling. One is on the server side, using `sampleRate`, but the bandwidth is still wasted and it may not be very flexible when there are many agent sources. The other is the agent-side `agent.sample_n_per_3_secs` option, but its use case is somewhat counter-intuitive. In many scenarios one may want a percentage sampling rate option on the agent side. What is the reason for not providing that and using a "traces in 3 seconds" mechanism instead? This looks a bit odd, and judging from the `SamplingService` code, a percentage approach would not be difficult to implement.
   ___
   ### Bug
   - Which version of SkyWalking, OS and JRE?
   v8.2.0
   openjdk version "1.8.0_265"
   
   ___
   ### Requirement or improvement
   - Provide an agent config option similar to `agent.sample_n_per_3_secs` but expressed as a percentage (e.g. 10 in every 10000). We can try to implement this and open a PR if that's OK.
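
For illustration only, the requested behaviour could be sketched roughly as below. This is not SkyWalking code: the class name `PercentageSampler`, the `per_ten_thousand` parameter and the injectable `rng` are all hypothetical, and the sketch is in Python rather than the agent's Java to keep it short.

```python
import random


class PercentageSampler:
    """Hypothetical sampler: keeps roughly `per_ten_thousand` of every 10000 traces."""

    def __init__(self, per_ten_thousand, rng=None):
        if not 0 <= per_ten_thousand <= 10000:
            raise ValueError("per_ten_thousand must be within [0, 10000]")
        self.per_ten_thousand = per_ten_thousand
        # An injectable random source keeps the sketch reproducible in tests.
        self._rng = rng if rng is not None else random.Random()

    def try_sampling(self):
        # randrange(10000) is uniform over 0..9999, so this comparison
        # succeeds with probability per_ten_thousand / 10000.
        return self._rng.randrange(10000) < self.per_ten_thousand
```

Under these assumptions, `PercentageSampler(10)` would correspond to the "10 in every 10000" mentioned above. The decision is random per request, so the observed share only approaches the configured one over many requests.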
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [skywalking] wu-sheng commented on issue #5997: agent side percentage sampling config option

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #5997:
URL: https://github.com/apache/skywalking/issues/5997#issuecomment-743725407


   > My previous understanding is that if I only set sampleRate on the server side, the headers will still be added to every message and stored in the Kafka brokers (Kafka storage and the traffic for sending messages to the brokers are not affected by this option, so the overhead may be non-negligible under high throughput). That's why I want "client side" sampling. I'm not sure about this behavior, so please correct me if I'm wrong.
   
   First of all, the so-called server-side `samplingRate` has nothing to do with transport, whether gRPC or Kafka. The two are unrelated.
   
   > The sample-n-in-3-seconds client config works for us, but a percentage one would be easier to reason about when setting this option, since different topics may have different message rates, in which case a percentage sampling rate seems more suitable.
   
   I think you misunderstand sampling. It is not just about the network; sampling on the client side reduces the whole agent load. Tracing context, header injection/extraction, span creation/operations, etc. are all skipped when a request is not sampled.
   The Kafka reporter is just a pluggable, optional component. No core-level mechanism is tied to it.
   
   ___
   The key point is that SkyWalking targets APM rather than just tracing, so metrics and topology really matter. Any client-side sampling mechanism has a side effect on them. We don't have a clear scenario in which a client-side sampling rate is better than the current mechanism.
   
   Our eventual plan is to use [SkyWalking Satellite](https://github.com/apache/skywalking-satellite), which is still being built and could be deployed as a sidecar, so there is no network cost. It will then analyze the traces to derive metrics (we call those **sources**) and forward them to the OAP.





[GitHub] [skywalking] wu-sheng closed issue #5997: agent side percentage sampling config option

Posted by GitBox <gi...@apache.org>.
wu-sheng closed issue #5997:
URL: https://github.com/apache/skywalking/issues/5997


   





[GitHub] [skywalking] pierre94 commented on issue #5997: agent side percentage sampling config option

Posted by GitBox <gi...@apache.org>.
pierre94 commented on issue #5997:
URL: https://github.com/apache/skywalking/issues/5997#issuecomment-743728541


   I also have a question about this configuration, `agent.sample_n_per_3_secs`: why sample per 3 secs, not 1 or 4? Is there any special significance?





[GitHub] [skywalking] wu-sheng commented on issue #5997: agent side percentage sampling config option

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #5997:
URL: https://github.com/apache/skywalking/issues/5997#issuecomment-743719266


   Hi, thanks for asking. The key question is why you need a client-side `sampleRate`. I need to correct you on one point:
   
   > but the bandwidth is still wasted
   
   This conclusion is not correct. The metrics for services, instances, etc. depend on it. The sampling only reduces the storage load; the metrics remain accurate.
   
   > may not be very flexible when there are many agent sources
   
   I am not sure what you mean by this. What kind of flexibility?
   
   > What is the reason for not providing that and using a "traces in 3 seconds" mechanism instead
   
   I could ask the question the other way around: what is the reason for requiring rate-based sampling?
   
   ___
   The key to a new feature is the scenario: what more could it bring?





[GitHub] [skywalking] wu-sheng commented on issue #5997: agent side percentage sampling config option

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #5997:
URL: https://github.com/apache/skywalking/issues/5997#issuecomment-743729368


   > I also have a question about this configuration, agent.sample_n_per_3_secs: why sample per 3 secs, not 1 or 4? Is there any special significance?
   
   There are two considerations:
   1. If the reset period is too short, all requests always get sampled, so there is no point.
   2. If it is too long, only the first few seconds of each period contain sampled values.
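
The trade-off above can be sketched with a simple fixed-window counter. This is a simplified model rather than the agent's actual `SamplingService`: the names `NPerWindowSampler` and `sampled_times` are hypothetical, time is passed in explicitly as milliseconds, and the counter is reset lazily on the first request of a new window.

```python
class NPerWindowSampler:
    """Hypothetical fixed-window sampler: at most `n` traces per `window_ms`."""

    def __init__(self, n, window_ms=3000):
        self.n = n
        self.window_ms = window_ms
        self._window_start = None
        self._count = 0

    def try_sampling(self, now_ms):
        # Open a new window (and reset the counter) once the current one has elapsed.
        if self._window_start is None or now_ms - self._window_start >= self.window_ms:
            self._window_start = now_ms
            self._count = 0
        if self._count < self.n:
            self._count += 1
            return True
        return False


def sampled_times(sampler, interval_ms, duration_ms):
    """Send one request every `interval_ms`; return the timestamps that got sampled."""
    return [t for t in range(0, duration_ms, interval_ms) if sampler.try_sampling(t)]


# One request every 100 ms for 60 s, n = 5, with three different reset periods.
short = sampled_times(NPerWindowSampler(5, window_ms=500), 100, 60000)
medium = sampled_times(NPerWindowSampler(5, window_ms=3000), 100, 60000)
long_ = sampled_times(NPerWindowSampler(5, window_ms=60000), 100, 60000)
print(len(short), len(medium), len(long_), max(long_))
```

In this model the 500 ms period samples every request, because only 5 requests arrive per window. The 60 s period samples 5 requests, all within the first half second, and nothing for the rest of the minute. The 3 s period spreads 100 samples evenly over the minute.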





[GitHub] [skywalking] Frefreak commented on issue #5997: agent side percentage sampling config option

Posted by GitBox <gi...@apache.org>.
Frefreak commented on issue #5997:
URL: https://github.com/apache/skywalking/issues/5997#issuecomment-743723779


   Hi, thanks for the timely response. One of our use cases involves an MQ such as Kafka.
   My previous understanding is that if I only set `sampleRate` on the server side, the headers will still be added to every message and stored in the Kafka brokers (Kafka storage and the traffic for sending messages to the brokers are not affected by this option, so the overhead may be non-negligible under high throughput). That's why I want "client side" sampling. I'm not sure about this behavior, so please correct me if I'm wrong.
   
   The sample-n-in-3-seconds client config works for us, but a percentage one would be easier to reason about when setting this option, since different topics may have different message rates, in which case a percentage sampling rate seems more suitable.
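
To make the comparison concrete, here is a small back-of-the-envelope calculation. The function names and the two message rates are made up for illustration, and the model assumes a steady message rate within each window.

```python
def fraction_sampled_by_count(n, msgs_per_sec, window_secs=3):
    """Share of messages kept when at most `n` are sampled per window."""
    arriving = msgs_per_sec * window_secs
    if arriving <= 0:
        return 0.0
    return min(1.0, n / arriving)


def traces_per_sec_by_percentage(rate, msgs_per_sec):
    """Expected sampled traces per second under a fixed sampling fraction."""
    return rate * msgs_per_sec


# Two hypothetical topics: one quiet (10 msg/s), one busy (1000 msg/s).
for msgs_per_sec in (10, 1000):
    print(
        msgs_per_sec,
        fraction_sampled_by_count(30, msgs_per_sec),
        traces_per_sec_by_percentage(0.01, msgs_per_sec),
    )
```

With the same `n = 30` per 3 seconds, the quiet topic is sampled completely while the busy one keeps about 1% of its messages. A fixed 1% rate keeps the share equal across topics, but the absolute trace volume then grows with throughput.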





[GitHub] [skywalking] wu-sheng commented on issue #5997: agent side percentage sampling config option

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #5997:
URL: https://github.com/apache/skywalking/issues/5997#issuecomment-743729015


   > but we feel that a "percentage sampling" option similar to that would also be nice to have.
   
   Why is it nice?
   
   > At least it would be useful in a multi-service scenario.
   
   How is it useful? Do you know that a sampling rate is never accurate, because a trace has to continue (when there is a context in the header) despite `not sampling = true`?
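
One reading of this point: a request that arrives with a propagated trace context must be traced regardless of the local sampler, so the effective rate ends up above the configured one. The sketch below models that reading under stated assumptions. `OneInKSampler` and `count_traced` are hypothetical names, and the rule that propagated context bypasses the local sampler is an assumption of the model, not a confirmed description of the agent.

```python
class OneInKSampler:
    """Hypothetical local sampler: keeps the first of every `k` requests it is asked about."""

    def __init__(self, k):
        self.k = k
        self._seen = 0

    def try_sampling(self):
        self._seen += 1
        return (self._seen - 1) % self.k == 0


def count_traced(requests, sampler):
    """Count traced requests; each item is True if the request carries upstream context."""
    traced = 0
    for has_upstream_context in requests:
        # Assumption: propagated context forces the trace to continue,
        # so the local sampler is not consulted for such requests.
        if has_upstream_context or sampler.try_sampling():
            traced += 1
    return traced


# 100 requests; every 4th one arrives with a trace context in its headers.
requests = [i % 4 == 0 for i in range(100)]
traced = count_traced(requests, OneInKSampler(10))
print(traced / len(requests))
```

Here the local sampler is configured for 1 in 10, yet 33 of the 100 requests are traced: 25 because of upstream context and 8 chosen locally from the remaining 75.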





[GitHub] [skywalking] Frefreak commented on issue #5997: agent side percentage sampling config option

Posted by GitBox <gi...@apache.org>.
Frefreak commented on issue #5997:
URL: https://github.com/apache/skywalking/issues/5997#issuecomment-743728198


   > First of all, the so-called server-side samplingRate has nothing to do with transport, whether gRPC or Kafka. The two are unrelated.
   
   Thanks, this is clear now. We now understand that the server-side samplingRate is not what we want.
   
   Let's focus on the client side. The existing `agent.sample_n_per_3_secs` option is exactly what we want for sampling, but we feel that a "percentage sampling" option similar to that would also be nice to have. At least it would be useful in a multi-service scenario.

