Posted to notifications@skywalking.apache.org by ke...@apache.org on 2021/02/17 12:35:02 UTC

[skywalking] branch lal created (now 33509ce)

This is an automated email from the ASF dual-hosted git repository.

kezhenxu94 pushed a change to branch lal
in repository https://gitbox.apache.org/repos/asf/skywalking.git.


      at 33509ce  Introduce log analysis language (LAL)

This branch includes the following new commits:

     new 33509ce  Introduce log analysis language (LAL)

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.



[skywalking] 01/01: Introduce log analysis language (LAL)

Posted by ke...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

kezhenxu94 pushed a commit to branch lal
in repository https://gitbox.apache.org/repos/asf/skywalking.git

commit 33509ce8be6f5532e912de479e5258bb6e45da8c
Author: kezhenxu94 <ke...@apache.org>
AuthorDate: Wed Feb 17 20:34:32 2021 +0800

    Introduce log analysis language (LAL)
---
 docs/en/concepts-and-designs/lal.md                | 191 +++++++++++++++++++++
 .../src/main/resources/lal/default.yaml            |  26 +++
 2 files changed, 217 insertions(+)

diff --git a/docs/en/concepts-and-designs/lal.md b/docs/en/concepts-and-designs/lal.md
new file mode 100644
index 0000000..48e14fc
--- /dev/null
+++ b/docs/en/concepts-and-designs/lal.md
@@ -0,0 +1,191 @@
+# Log Analysis Language
+
+Log Analysis Language (LAL) in SkyWalking is a Domain-Specific Language (DSL) for analyzing logs. You can use
+LAL to parse, extract, filter, analyze, and save logs, as well as correlate logs with traces and metrics.
+
+Check the [`default.yaml`](../../../oap-server/server-bootstrap/src/main/resources/lal/default.yaml) for examples.
+
+## Filter
+
+A filter is a group of a [parser](#parser), [extractor](#extractor) and [sink](#sink). Users can use one or more filters
+to organize their processing logic. Every log entry is sent to all filters in an LAL rule.
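+
+A minimal skeleton of a filter with all three components in place looks like this (the bodies are elided here; complete examples follow in the sections below):
+
+```groovy
+filter {
+    text {
+        // parser: parse the raw logs into the `parsed` map
+    }
+    extractor {
+        // extractor: extract metadata / metrics from `parsed`
+    }
+    sink {
+        // sink: decide whether and how to persist the logs
+    }
+}
+```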
+
+### Parser
+
+Parsers are responsible for parsing the raw logs into structured data in SkyWalking for further processing. There are 3
+types of parsers at the moment, namely `json`, `yaml`, and `text`.
+
+When a piece of log is parsed, LAL injects a corresponding property, called `parsed`, for further use.
+Property `parsed` is typically a map containing all the fields parsed from the raw logs. For example, if the parser
+is `json` or `yaml`, `parsed` is a map containing all the key-value pairs in the `json` / `yaml`; if the parser is `text`
+, `parsed` is a map containing all the captured groups and their values (for `regexp` and `grok`). See the examples below.
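+
+For illustration, suppose a raw JSON log `{"level": "INFO", "serviceName": "billing"}` is handled by the `json` parser; the empty `json { }` block and the field names below are illustrative assumptions, not a fixed schema:
+
+```groovy
+filter {
+    json {
+        // the raw JSON log is parsed into the `parsed` map
+    }
+    extractor {
+        service parsed["serviceName"]          // "billing"
+        tag key: "level", val: parsed["level"] // "INFO"
+    }
+}
+```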
+
+#### `json`
+
+<!-- TODO: is structured in the reported (gRPC) `LogData`, not much to do -->
+
+#### `yaml`
+
+<!-- TODO: is structured in the reported (gRPC) `LogData`, not much to do -->
+
+#### `text`
+
+For unstructured logs, the following `text` parser strategies are available.
+
+- `regexp`
+
+The `regexp` parser uses a regular expression (`regexp`) to parse the logs. It leverages the named capture groups of
+the regexp; all the captured groups can be used later in the extractors or sinks.
+
+```groovy
+filter {
+    text {
+        regexp "(?<timestamp>\\d{8}) (?<thread>\\w+) (?<level>\\w+) (?<traceId>\\w+) (?<msg>.+)"
+        // this is just a demo pattern
+    }
+    extractor {
+        tag key: "level", val: parsed["level"]
+        // we add a tag called `level` and its value is parsed["level"], captured from the regexp above
+        tid parsed["traceId"]
+        // we also extract the trace id from the parsed result, which will be used to associate the log with the trace
+    }
+    // ...
+}
+```
+
+- `grok`
+
+<!-- TODO: grok Java library has poor performance, need to benchmark it, the idea is basically the same with `regexp` above -->
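+
+If `grok` support lands, a rule would presumably mirror the `regexp` one, with grok patterns producing the captured groups. The following is a purely hypothetical sketch; the `grok` keyword and its exact syntax here are assumptions, not a committed API:
+
+```groovy
+filter {
+    text {
+        // hypothetical; standard grok pattern names shown for illustration
+        grok "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}"
+    }
+    extractor {
+        tag key: "level", val: parsed["level"] // grok capture names become keys of `parsed`
+    }
+}
+```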
+
+### Extractor
+
+Extractors aim to extract metadata from the logs. The metadata can be a service name, a service instance name, an
+endpoint name, or even a trace ID, all of which can be associated with the existing traces and metrics.
+
+- `service`
+
+`service` extracts the service name from the `parsed` result and sets it into the `LogData`, which will be persisted (if
+not dropped) and is used to associate with traces / metrics.
+
+- `instance`
+
+`instance` extracts the service instance name from the `parsed` result and sets it into the `LogData`, which will be
+persisted (if not dropped) and is used to associate with traces / metrics.
+
+- `endpoint`
+
+`endpoint` extracts the endpoint name from the `parsed` result and sets it into the `LogData`, which will be
+persisted (if not dropped) and is used to associate with traces / metrics.
+
+- `tid`
+
+`tid` extracts the trace ID from the `parsed` result and sets it into the `LogData`, which will be
+persisted (if not dropped) and is used to associate with traces / metrics.
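+
+Taken together, the four extractors above can be combined in a single `extractor` block; the `parsed` field names below are assumptions that depend on your log format:
+
+```groovy
+filter {
+    // ... parser
+    extractor {
+        service parsed["serviceName"]   // hypothetical field names,
+        instance parsed["instanceName"] // depending on your log format
+        endpoint parsed["endpointName"]
+        tid parsed["traceId"]
+    }
+    // ... sink
+}
+```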
+
+- `metrics`
+
+`metrics` extracts / generates metrics from the logs. The supported metrics are `CounterMetrics`, `GaugeMetrics`,
+and `HistogramMetrics`. An example is as follows:
+
+```groovy
+filter {
+    // ...
+    extractor {
+        service parsed["serviceName"]
+        metrics {
+            counter {
+                name: "logsCount"
+                tips: "The total count of received logs"
+                tags: ["key1": "value1", "key2": "value2"]
+            }
+            gauge {
+                name: "whatever"
+                tips: "whatever"
+                tags: ["k1": "v1", "k2": "v2"]
+            }
+        }
+    }
+    // ...
+}
+```
+
+### Sink
+
+Sinks are the persistence layer of the LAL. By default, all the logs of each filter are persisted into the storage. However, there are some mechanisms that allow you to selectively save some logs, or even drop all the logs after you've extracted useful information, such as metrics.
+
+#### Sampler
+
+The sampler allows you to save logs in a sampling manner. Currently, 2 sampling strategies are supported, `ratelimit` and `probabilistic`.
+
+`ratelimit` samples at most `n` logs in a given duration (e.g. 1 second), while
+`probabilistic` samples `n%` of the logs.
+
+Examples:
+
+```groovy
+filter {
+    // ... parser
+    
+    sampler {
+        ratelimit 100.per.second // 100 logs per second
+    }
+}
+```
+
+```groovy
+filter {
+    // ... parser
+    
+    sampler {
+        probabilistic 50.percent // 50% logs
+    }
+}
+```
+
+#### Dropper
+
+Dropper is a special sink: all the logs sent to it are dropped, without exception. This is useful when you want to drop debugging logs:
+
+```groovy
+filter {
+    // ... parser
+    
+    sink {
+        if (log.level == "DEBUG") {
+            dropper {}
+        } else {
+            sampler {
+                // ... configs
+            }
+        }
+    }
+}
+```
+
+or when you have multiple filters, some of which are only for extracting metrics, so that only one of them needs to persist the logs:
+
+```groovy
+filter { // filter A: this is for persistence
+    // ... parser
+
+    sink {
+        sampler {
+            // .. sampler configs
+        }
+    }
+}
+filter { // filter B:
+    // ... extractors to generate many metrics
+    extractors {
+        metrics {
+            // ... counter
+            // ... gauge
+            // ... histogram
+            // ... etc.
+        }
+    }
+    sink {
+        dropper {} // drop all logs because they have been saved in "filter A" above.
+    }
+}
+```
diff --git a/oap-server/server-bootstrap/src/main/resources/lal/default.yaml b/oap-server/server-bootstrap/src/main/resources/lal/default.yaml
new file mode 100644
index 0000000..f692245
--- /dev/null
+++ b/oap-server/server-bootstrap/src/main/resources/lal/default.yaml
@@ -0,0 +1,26 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+rules:
+  - name: default
+    dsl: |
+      filter {
+        text {
+          regexp $/now: (?<ts>\d+)/$
+          add_tag key: "timestamp", val: parsed["ts"]
+        }
+        sink {
+        }
+      }