Posted to issues@solr.apache.org by "David Smiley (Jira)" <ji...@apache.org> on 2021/03/24 16:30:00 UTC

[jira] [Commented] (SOLR-15283) Remove Solr trace sampling; let Tracer configuration/impl decide

    [ https://issues.apache.org/jira/browse/SOLR-15283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17307984#comment-17307984 ] 

David Smiley commented on SOLR-15283:
-------------------------------------

FYI
Jaeger's sampling docs: https://www.jaegertracing.io/docs/1.22/sampling/
Brave (Zipkin Java lib) sampler code: https://github.com/openzipkin/brave/tree/master/brave/src/main/java/brave/sampler (AFAICT Brave has no formal sampling configuration; samplers are set up in code)

> Remove Solr trace sampling; let Tracer configuration/impl decide
> ----------------------------------------------------------------
>
>                 Key: SOLR-15283
>                 URL: https://issues.apache.org/jira/browse/SOLR-15283
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Blocker
>             Fix For: main (9.0)
>
>
> GlobalTracer should always have the Tracer produced by the TracerConfiguratorPlugin.  Solr should not intervene by substituting a no-op version sometimes, and thus needn't have its ThreadLocal tracking either (which doesn't work well).  The special {{samplePercentage}} cluster property should be removed.
> Background: When someone configures tracing (by supplying a TracerConfigurator plugin), Solr will "sample" tracing if an incoming request has no tracing information.  By default this is 10% and is only configurable via a {{samplePercentage}} cluster property.  If you're in the other 90%, this results in a no-op Tracer -- no trace IDs.  This is really confusing & annoying because Tracers themselves have notions of sampling, where sampling means "reporting" (sending) the trace to a tracing server where it can be stored/analyzed/visualized.  The point of a non-sampled trace is propagating IDs for logging (trace ID in MDC) -- very lightweight.  Zipkin and Jaeger (and others?) have their own samplers.  When Solr receives a request with a trace ID, in Zipkin the request also includes the binary sampling decision (it's another header).  The expectation is that if the incoming trace says to sample, then this sampling decision is propagated downstream and thus the whole call tree is fully sampled (reported to a server).
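
To make the distinction in the description concrete, here is a minimal sketch of the behavior it asks for: a trace ID always exists, and the sampling flag only decides whether the trace gets reported. Everything in it is hypothetical and for illustration only: the class, the method names, and the header names are not Solr, OpenTracing, Zipkin, or Jaeger APIs, and a real Tracer implementation would handle all of this internally.

```java
import java.util.Map;
import java.util.UUID;
import java.util.function.DoubleSupplier;

/** Hypothetical sketch; not an API of Solr or of any tracing library. */
final class TraceSamplingSketch {

    // Hypothetical header names, chosen only for this example.
    static final String TRACE_ID_HEADER = "x-trace-id";
    static final String SAMPLED_HEADER = "x-trace-sampled";

    /** The trace ID is always present; "sampled" only controls reporting. */
    static final class TraceContext {
        final String traceId;
        final boolean sampled;

        TraceContext(String traceId, boolean sampled) {
            this.traceId = traceId;
            this.sampled = sampled;
        }
    }

    /**
     * Continues an incoming trace or starts a new one. Either way the caller
     * gets a usable trace ID (e.g. for logging via MDC), never a no-op.
     */
    static TraceContext extractOrStart(Map<String, String> headers,
                                       double sampleRate,
                                       DoubleSupplier random) {
        String incomingId = headers.get(TRACE_ID_HEADER);
        String incomingFlag = headers.get(SAMPLED_HEADER);
        if (incomingId != null && incomingFlag != null) {
            // The upstream caller already decided; honor it so the whole
            // call tree is consistently sampled or not sampled.
            return new TraceContext(incomingId, incomingFlag.equals("1"));
        }
        // No upstream decision: make a local one, but keep or create an ID.
        String traceId = incomingId != null ? incomingId : UUID.randomUUID().toString();
        boolean sampled = random.getAsDouble() < sampleRate;
        return new TraceContext(traceId, sampled);
    }

    /** Headers to send downstream so the ID and the decision both propagate. */
    static Map<String, String> inject(TraceContext ctx) {
        return Map.of(TRACE_ID_HEADER, ctx.traceId,
                      SAMPLED_HEADER, ctx.sampled ? "1" : "0");
    }
}
```

In this sketch, calling extractOrStart with no incoming headers still yields a context with a trace ID even when the local sampling decision is "no", which is the lightweight ID-propagation case the description refers to. Passing the random source in as a parameter is only a convenience so the decision can be made deterministic in tests.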


