Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2021/02/20 09:52:47 UTC
[GitHub] [skywalking] wu-sheng opened a new issue #6411: [Core Feature] Merge metrics indexes of ElasticSearch based on MetricsFunction/MeterFunction annotation
wu-sheng opened a new issue #6411:
URL: https://github.com/apache/skywalking/issues/6411
Hi
This is a core feature/improvement proposal. Currently, this is a theoretical solution.
## Background
SkyWalking today has huge extensibility in log/metric/trace analysis. As a result, many metrics are being added or will be added, unlike the old days, when we relied mostly on trace analysis.
Right now, every metric SkyWalking generates maps logically to one ElasticSearch index. Even though we have merged the day and hour levels into the minute level and provided `dayStep` to reduce the number of indexes, the number is still huge and, most importantly, it keeps increasing.
At the same time, a metric index is much smaller than a segment index, for example:
```
skywalking-online_service_cpm-20210213 FB8Yym7aTDqgAgpYJrG_NQ 1 1 941083 2612 191.8mb 95.8mb
skywalking-online_endpoint_cpm-20210130 V8SZg-1qSO6mhvA0m7cs9A 1 1 8777005 36984 3.5gb 1.7gb
skywalking-online_service_instance_relation_client_cpm-20210123 Qb-cxpTGS7-Qpz02n1T8LQ 1 1 95699175 563518 60.1gb 30gb
skywalking-online_segment-20210218 5aOr9YGeTq2_xfOPypCQUQ 30 0 353400759 0 389.3gb 389.3gb
```
## Proposal
If we merge `service_cpm`, `endpoint_cpm` and `service_instance_relation_client_cpm` into one index, let's call it `cpm-*`, the impact on query performance is very limited, because the size of the data set is basically still ~30GB (60GB with one replica).
To separate these metrics for the SkyWalking query, we need to add the metric name to the `ROWID` and also store it as a separate column, which is not a big issue. The only side effect is that the ROWID grows by 10-20 characters.
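A minimal sketch of the ID scheme described above, assuming the merged document ID is simply the metric name joined to the original ID with an underscore. The class `MergedMetricId`, the separator, and the column name `metric_table` are hypothetical and chosen only for illustration; they are not existing SkyWalking APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of SkyWalking. */
final class MergedMetricId {
    // Assumed name of the extra column that stores the logical metric name.
    static final String METRIC_NAME_COLUMN = "metric_table";
    // Assumed separator between the metric name and the original row ID.
    static final String SEPARATOR = "_";

    private MergedMetricId() {
    }

    /** Prefixes the original row ID with the logical metric name. */
    static String build(String metricName, String originalId) {
        return metricName + SEPARATOR + originalId;
    }

    /** Copies the source document and adds the metric name as a separate column. */
    static Map<String, Object> tag(String metricName, Map<String, Object> source) {
        Map<String, Object> doc = new HashMap<>(source);
        doc.put(METRIC_NAME_COLUMN, metricName);
        return doc;
    }
}
```

With this shape, a query against the merged index would filter on the extra column, while the prefixed ID keeps rows of different metrics from colliding.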
Then, consider that this grouping mechanism is 100% automatic, so users may feel confused.
Finally, from the ElasticSearch storage perspective, because these metrics share the same metric function, most columns are still the same. But as we need `StorageEsInstaller#createMapping`, we need the union of all possible columns of all metrics of one kind. In the previous case, `endpoint_cpm` has one or two more columns.
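The column union could be computed roughly as follows. This is only a sketch under stated assumptions: each metric's columns are represented as a plain map from column name to ElasticSearch field type, and `ColumnUnion` is a hypothetical helper, not the real `StorageEsInstaller` code. Rejecting conflicting types is an added design choice, since a single index cannot map one field name to two types.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of SkyWalking. */
final class ColumnUnion {
    private ColumnUnion() {
    }

    /**
     * Merges per-metric column definitions (column name -> field type)
     * into one mapping that covers every metric in the group.
     * Throws if two metrics declare the same column with different types.
     */
    static Map<String, String> union(List<Map<String, String>> perMetricColumns) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> columns : perMetricColumns) {
            for (Map.Entry<String, String> column : columns.entrySet()) {
                // putIfAbsent returns the existing type, or null if the column is new.
                String previous = merged.putIfAbsent(column.getKey(), column.getValue());
                if (previous != null && !previous.equals(column.getValue())) {
                    throw new IllegalStateException(
                        "Conflicting types for column " + column.getKey()
                            + ": " + previous + " vs " + column.getValue());
                }
            }
        }
        return merged;
    }
}
```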
As a result, we need to change `createTable(Model model)` to `createTable(List<Model> models)`, to give the storage implementation a chance to merge logical models before creating the physical tables.
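One way the caller side of that signature change might look, as a sketch only: `LogicalModel` is a stand-in for SkyWalking's `Model` that carries just a name and a function name, `TableInstaller` is a hypothetical interface with the proposed list-based method, and grouping by function name is an assumption drawn from the issue title.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch for illustration; not part of SkyWalking. */
final class ModelGrouping {
    /** Stand-in for SkyWalking's Model with only the fields this sketch needs. */
    static final class LogicalModel {
        final String name;
        final String functionName;

        LogicalModel(String name, String functionName) {
            this.name = name;
            this.functionName = functionName;
        }
    }

    /** Assumed shape of the installer after the proposed change. */
    interface TableInstaller {
        void createTable(List<LogicalModel> models);
    }

    private ModelGrouping() {
    }

    /** Groups models by function name, preserving first-seen order. */
    static Map<String, List<LogicalModel>> groupByFunction(List<LogicalModel> models) {
        Map<String, List<LogicalModel>> groups = new LinkedHashMap<>();
        for (LogicalModel model : models) {
            groups.computeIfAbsent(model.functionName, key -> new ArrayList<>()).add(model);
        }
        return groups;
    }

    /** Hands each group to the installer in one call, so it can merge them. */
    static void install(List<LogicalModel> models, TableInstaller installer) {
        for (List<LogicalModel> group : groupByFunction(models).values()) {
            installer.createTable(group);
        }
    }
}
```

A storage implementation that does not want merging could still iterate over the list and create one table per model, so the list-based signature does not force the ElasticSearch behavior on other storages.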
@apache/skywalking-committers I would like to hear your feedback. I talked with @EvanLjp privately, and think this is worth trying.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [skywalking] wu-sheng closed issue #6411: [Core Feature] Merge metrics indexes of ElasticSearch based on MetricsFunction/MeterFunction annotation
Posted by GitBox <gi...@apache.org>.
wu-sheng closed issue #6411:
URL: https://github.com/apache/skywalking/issues/6411