Posted to commits@couchdb.apache.org by ii...@apache.org on 2019/09/23 11:56:29 UTC
[couchdb-documentation] branch master updated: RFC-011 : Opentracing support
This is an automated email from the ASF dual-hosted git repository.
iilyak pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/couchdb-documentation.git
The following commit(s) were added to refs/heads/master by this push:
new 2cf5d9d RFC-011 : Opentracing support
new d611484 Merge pull request #440 from cloudant/011-opentracing-support
2cf5d9d is described below
commit 2cf5d9d82dd56cbe4b299cbe142e4246ee406da6
Author: ILYA Khlopotov <ii...@apache.org>
AuthorDate: Mon Sep 16 12:20:47 2019 -0700
RFC-011 : Opentracing support
---
rfcs/011-opentracing.md | 236 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 236 insertions(+)
diff --git a/rfcs/011-opentracing.md b/rfcs/011-opentracing.md
new file mode 100644
index 0000000..bf4a059
--- /dev/null
+++ b/rfcs/011-opentracing.md
@@ -0,0 +1,236 @@
+---
+name: Opentracing support
+about: Adopt industry standard distributed tracing solution
+title: 'Opentracing support'
+labels: rfc, discussion
+assignees: ''
+
+---
+
+Adopt industry-standard, vendor-neutral APIs and instrumentation for distributed tracing.
+
+# Introduction
+
+Collecting profiling data from CouchDB is currently difficult.
+Developers have to run generic profiling tools that are not aware of CouchDB specifics,
+which makes performance optimization work hard. We need a tool that lets us
+collect profiling data from specific points in the codebase.
+This means instrumenting the code.
+
+## Abstract
+
+[OpenTracing](https://opentracing.io/) is a vendor-neutral API and instrumentation standard
+for distributed tracing. In Erlang it is implemented by the following libraries:
+ - [otters](https://github.com/project-fifo/otters): an extended and more performant version of `otter`
+ - [opentracing-erlang](https://github.com/opentracing-contrib/opentracing-erlang): the `otter` version donated to the opentracing project
+ - [original otter](https://github.com/Bluehouse-Technology/otter)
+ - [passage](https://github.com/sile/jaeger_passage)
+
+The opentracing philosophy is founded on three pillars:
+- Low overhead: the tracing system should have a negligible performance impact on running services.
+- Application-level transparency: programmers should not need to be aware of the tracing system.
+- Scalability
+
+The main change is to include one of the libraries mentioned above and add instrumentation points to the codebase.
+In the initial implementation, a new span would be started on every HTTP request.
+The following HTTP headers would be used to link the tracing span with application-specific traces:
+- X-B3-ParentSpanId
+- X-B3-TraceId
+- b3
+
+More information about these headers can be found in the [B3 propagation documentation](https://github.com/openzipkin/b3-propagation).
+The OpenTracing [specification](https://github.com/opentracing/specification/blob/master/specification.md)
+defines a number of [conventions](https://github.com/opentracing/specification/blob/master/semantic_conventions.md)
+that would be good to follow.
+
+In a nutshell the idea is:
+- Take the reference to the parent span from one of the supported headers and pass it to the `span_start` call.
+- Construct the action name to use in the `span_start` call.
+- Call `span_start` from `chttpd:handle_request_int/1`.
+- Pass the span in the `#httpd{}` record.
+- Pass `trace_id` and `parent_span_id` through the stack (extend records if needed)
+- Attach span tags to better identify trace events.
+- Attach span logs at important instrumentation points.
+- Forward spans to external service.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+document are to be interpreted as described in
+[RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+- [span](https://github.com/opentracing/specification/blob/1.1/specification.md#the-opentracing-data-model): The "span"
+ is the primary building block of a distributed trace, representing an individual unit of work done in a distributed system.
+ Each Span encapsulates the following state:
+ - An operation name
+ - A start timestamp
+ - A finish timestamp
+ - A set of zero or more key:value `Span Tags`.
+ - A set of zero or more structured logs (key:value `Span Logs`).
+ - A `SpanContext`
+ - `References` to zero or more causally-related `Spans`
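+
+As a rough illustration of the state listed above, the sketch below models a span as a plain
+Python data structure. It is not the API of `otters` or `passage`: the class names, field names
+and methods are hypothetical and chosen only to mirror the list.
+
+```python
+import time
+from dataclasses import dataclass, field
+from typing import Any, Dict, List, Optional, Tuple
+
+
+@dataclass
+class SpanContext:
+    """Identifiers that travel across process boundaries (hypothetical shape)."""
+    trace_id: str
+    span_id: str
+
+
+@dataclass
+class Span:
+    operation_name: str
+    context: SpanContext
+    # Contexts of causally related spans, for example the parent span.
+    references: List[SpanContext] = field(default_factory=list)
+    tags: Dict[str, Any] = field(default_factory=dict)
+    # Each log entry is a (timestamp, key:value fields) pair.
+    logs: List[Tuple[float, Dict[str, Any]]] = field(default_factory=list)
+    start_time: float = field(default_factory=time.time)
+    finish_time: Optional[float] = None
+
+    def tag(self, key: str, value: Any) -> None:
+        self.tags[key] = value
+
+    def log(self, **fields: Any) -> None:
+        self.logs.append((time.time(), fields))
+
+    def finish(self) -> None:
+        self.finish_time = time.time()
+```
+
+A span created this way has no finish timestamp until `finish()` is called, which matches the
+idea that a span represents one unit of work with a start and an end.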
+
+---
+
+# Detailed Description
+
+## Selection of a library
+
+As mentioned earlier, there are two flavours of libraries. Neither of them is perfect for all use cases.
+The biggest differences between `otters` and `passage` are:
+
+| | otters | passage |
+| ------------------------------ | ----------- | ------------------------- |
+| reporting protocol | http | udp |
+| filtering | custom DSL | sampling callback module |
+| reporter | zipkin only | jaeger or plugin |
+| functional API | + | + |
+| process dictionary | + | + |
+| process based span storage | + | - |
+| send event in batches | + | - |
+| sender overload detection | - | + |
+| report batches based on | timer | spans of single operation |
+| design for performance | + | - |
+| design for robustness at scale | - | + |
+| counters | + | - |
+| sampling based on duration | + | - |
+| number of extra dependencies | 1 | 3 |
+
+In order to allow future replacement of a tracing library it would be desirable to create an interface module `couch_trace`.
+The `otters` library would be used for the first iteration.
+
+## Configuration
+
+The `otters` library uses the application environment to store its configuration.
+It also has a facility to compile its filtering DSL into a beam module.
+The filtering DSL looks like the following: `<name>([<condition>]) -> <action>.`.
+The safety of the DSL compiler is unknown. Therefore, modifying tracing settings through the HTTP configuration API would not be allowed.
+The otter-related config section `tracing.filters` would be protected by BLACKLIST_CONFIG_SECTIONS.
+Tracing could only be configured from remsh or by editing the ini file.
+The configuration for otter filters would be stored in couch_config as follows:
+```
+[tracing.filters]
+
+<name> = ([<condition>]) -> <action>.
+```
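+
+The protection described in this section can be pictured with the following Python sketch. It is
+only an illustration of the idea, not CouchDB code: the set below is a stand-in for
+BLACKLIST_CONFIG_SECTIONS, and `set_config_via_http` and `ConfigUpdateRejected` are hypothetical names.
+
+```python
+# Hypothetical stand-in for CouchDB's BLACKLIST_CONFIG_SECTIONS.
+BLACKLIST_CONFIG_SECTIONS = {"tracing.filters"}
+
+
+class ConfigUpdateRejected(Exception):
+    """Raised when an HTTP request tries to change a protected section."""
+
+
+def set_config_via_http(store, section, key, value):
+    """Apply a config change arriving over HTTP unless the section is protected."""
+    if section in BLACKLIST_CONFIG_SECTIONS:
+        raise ConfigUpdateRejected(
+            f"section [{section}] cannot be modified over HTTP"
+        )
+    store.setdefault(section, {})[key] = value
+    return store
+```
+
+Changes made from remsh or by editing the ini file would not pass through such a check, which is
+why those two paths remain available for tracing configuration.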
+
+## Tracing related HTTP headers
+
+The following request headers would be supported:
+- X-B3-ParentSpanId : 16 lower-hex characters
+- X-B3-TraceId : 32 lower-hex characters
+- X-B3-SpanId : 16 lower-hex characters
+- b3 : {TraceId}-{SpanId}-{SamplingState}-{ParentSpanId}
+ - the `SamplingState` would be ignored
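+
+A minimal Python sketch of how these request headers could be read is shown below. It follows
+the lengths given in the list above and ignores `SamplingState`. The function name `parse_b3`,
+the preference for the single `b3` header when both forms are present, and the choice to reject
+the whole context on any malformed value are assumptions made for illustration only.
+
+```python
+import re
+
+TRACE_ID_RE = re.compile(r"[0-9a-f]{32}")  # 32 lower-hex characters
+SPAN_ID_RE = re.compile(r"[0-9a-f]{16}")   # 16 lower-hex characters
+
+
+def parse_b3(headers):
+    """Return (trace_id, span_id, parent_span_id), or None if no valid context."""
+    headers = {name.lower(): value for name, value in headers.items()}
+    if "b3" in headers:
+        # b3: {TraceId}-{SpanId}-{SamplingState}-{ParentSpanId}
+        parts = headers["b3"].split("-")
+        trace_id = parts[0]
+        span_id = parts[1] if len(parts) > 1 else None
+        # parts[2] is the SamplingState, which is deliberately ignored.
+        parent_id = parts[3] if len(parts) > 3 else None
+    else:
+        trace_id = headers.get("x-b3-traceid")
+        span_id = headers.get("x-b3-spanid")
+        parent_id = headers.get("x-b3-parentspanid")
+    if trace_id is None or not TRACE_ID_RE.fullmatch(trace_id):
+        return None
+    if span_id is None or not SPAN_ID_RE.fullmatch(span_id):
+        return None
+    if parent_id is not None and not SPAN_ID_RE.fullmatch(parent_id):
+        return None
+    return trace_id, span_id, parent_id
+```
+
+Returning `None` lets the caller fall back to starting a fresh trace when the client sent no
+usable tracing context.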
+
+The following response headers would be supported:
+- X-B3-ParentSpanId : 16 lower-hex characters
+- X-B3-TraceId : 32 lower-hex characters
+- X-B3-SpanId : 16 lower-hex characters
+
+## Conventions
+
+The conventions below are based on [conventions from opentracing](https://github.com/opentracing/specification/blob/master/semantic_conventions.md#standard-span-tags-and-log-fields).
+All tags are optional, since they are only a recommendation from OpenTracing that gives hints to visualization and filtering tools.
+
+### Span tags
+
+| Span tag name | Type | Notes and examples |
+| ---------------- | ------- | --------------------------------------------------- |
+| component | string | couchdb.<app> (e.g. couchdb.chttpd, couchdb.fabric) |
+| db.instance | string | for fdb-layer would be fdb connection string |
+| db.type | string | for fdb-layer would be fdb |
+| error | bool | `true` if operation failed |
+| http.method | string | HTTP method of the request for the associated Span |
+| http.status_code | integer | HTTP response status code for the associated Span |
+| http.url | string | sanitized URL of the request in URI format |
+| span.kind | string | Either `client` or `server` (RPC roles). |
+| user | string | Authenticated user name |
+| db.name | string | Name of the accessed database |
+| db.shard | string | Name of the accessed shard |
+| nonce | string | Nonce used for the request |
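+
+To make the table concrete, the Python sketch below builds a tag set for an HTTP request span.
+The helper names are hypothetical. What "sanitized" means is not defined in this document, so
+dropping credentials and the query string is an assumption, as is treating a 5xx status as a
+failed operation.
+
+```python
+from urllib.parse import urlsplit, urlunsplit
+
+
+def sanitize_url(url):
+    """Drop userinfo (user:password@) and the query string (assumed policy)."""
+    parts = urlsplit(url)
+    host = parts.netloc.rpartition("@")[2]  # keep host[:port] only
+    return urlunsplit((parts.scheme, host, parts.path, "", ""))
+
+
+def http_span_tags(method, url, status_code, user=None):
+    tags = {
+        "component": "couchdb.chttpd",
+        "span.kind": "server",
+        "http.method": method,
+        "http.url": sanitize_url(url),
+        "http.status_code": status_code,
+    }
+    if status_code >= 500:  # assumption: 5xx counts as a failed operation
+        tags["error"] = True
+    if user is not None:
+        tags["user"] = user
+    return tags
+```
+
+Since all tags are optional, a caller could omit any of these keys; the sketch only shows one
+consistent way to fill them.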
+
+
+### Log fields
+
+| Span log field name | Type | Notes and examples |
+| ------------------- | ------- | ------------------------------------------- |
+| error.kind | string | The "kind" of an error (error, exit, throw) |
+| message | string | human-readable, one-line message |
+| stack | string | A stack trace (\n between lines) |
+
+## Multicomponent traces
+
+CouchDB has a complex architecture. Request handling crosses layer and component boundaries.
+Every component or layer would start a new span. It *MUST* specify its parent span in order
+for visualization tools to work. The value of the TraceId *MUST* be included in every span start.
+The values of TraceId and SpanId *MAY* be passed to FDB when
+[foundationdb#2085](https://github.com/apple/foundationdb/issues/2085) is resolved.
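+
+The parent and trace requirements above can be sketched in Python as follows. This is not the
+`otters` API; the function names and the dictionary layout are hypothetical. The point is only
+that a child span copies the TraceId of its parent and records the parent's SpanId.
+
+```python
+import secrets
+
+
+def new_span_id():
+    # 8 random bytes rendered as 16 lower-hex characters
+    return secrets.token_hex(8)
+
+
+def start_root_span(operation):
+    """Start a span with a fresh 32-character TraceId and no parent."""
+    return {
+        "operation": operation,
+        "trace_id": secrets.token_hex(16),
+        "span_id": new_span_id(),
+        "parent_span_id": None,
+    }
+
+
+def start_child_span(parent, operation):
+    """Start a span in another layer, linked to the span that caused it."""
+    return {
+        "operation": operation,
+        "trace_id": parent["trace_id"],       # same trace across all layers
+        "span_id": new_span_id(),
+        "parent_span_id": parent["span_id"],  # lets tools rebuild the tree
+    }
+```
+
+For example, a span started in `chttpd` would be the parent of a span started in `fabric`, and
+both would share one TraceId.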
+
+## Roadmap
+
+- initial implementation as described in this document
+- extend rexi to pass traceid and parentspanid
+- redo otter configuration
+- add tracing to server initiated jobs (compaction, replication)
+- rewrite `otters_conn_zipkin:send_buffer/0` to make it more robust
+- switch `otters_conn_zipkin` from `thrift` to `gRPC`
+
+
+# Advantages and Disadvantages
+
+## Drawbacks
+
+For the `otters` library specifically, there are the following concerns:
+- safety of configuration mechanism
+- the robustness of the zipkin sender
+
+## Advantages
+
+- Ability to forward tracing events to external system for further analysis
+- Low overhead
+- Structured logging for span logs
+- Link all events to the same parent trace id
+
+# Key Changes
+
+- New configuration section
+- New dependencies
+- Additional HTTP headers
+- Additional fields in some records
+
+## Applications and Modules affected
+
+- chttpd
+- couch_trace (new module)
+
+## HTTP API additions
+
+Support for the following headers would be added:
+- X-B3-ParentSpanId
+- X-B3-TraceId
+- b3
+
+## HTTP API deprecations
+
+N/A
+
+# Security Considerations
+
+The risk of injecting a malicious payload into the ini config is mitigated by placing the section into BLACKLIST_CONFIG_SECTIONS.
+
+# References
+
+- [opentracing specification](https://github.com/opentracing/specification/blob/master/specification.md)
+- https://opentracing.io/
+- https://www.jaegertracing.io/docs/1.14/
+- https://zipkin.io
+- [opentracing conventions](https://github.com/opentracing/specification/blob/master/semantic_conventions.md)
+
+
+# Acknowledgements
+
+[TIP]: # ( Who helped you write this RFC? )