Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2021/02/20 09:52:47 UTC

[GitHub] [skywalking] wu-sheng opened a new issue #6411: [Core Feature] Merge metrics indexes of ElasticSearch based on MetricsFunction/MeterFunction annotation

wu-sheng opened a new issue #6411:
URL: https://github.com/apache/skywalking/issues/6411


   Hi
   
   This is a core feature/improvement proposal. Currently, this is only a theoretical solution.
   
   ## Background
   SkyWalking today is highly extensible in log/metric/trace analysis. As a result, many metrics are being added or will be added, unlike in the old days, when we relied mostly on trace analysis.
   Right now, every metric generated by SkyWalking logically maps to one ElasticSearch index. Even though we have merged the day and hour levels into the minute level and provided dayStep to reduce the number of indexes, the number is still huge and, most importantly, it keeps increasing.
   At the same time, a metric index is much smaller than a segment index, for example:
   ```
   skywalking-online_service_cpm-20210213                                 FB8Yym7aTDqgAgpYJrG_NQ  1 1    941083    2612 191.8mb  95.8mb 
   skywalking-online_endpoint_cpm-20210130                                V8SZg-1qSO6mhvA0m7cs9A  1 1   8777005   36984   3.5gb   1.7gb
   skywalking-online_service_instance_relation_client_cpm-20210123        Qb-cxpTGS7-Qpz02n1T8LQ  1 1  95699175  563518  60.1gb    30gb
   skywalking-online_segment-20210218                                     5aOr9YGeTq2_xfOPypCQUQ 30 0 353400759       0 389.3gb 389.3gb
   ```
   
   ## Proposal
   If we merge `service_cpm`, `endpoint_cpm` and `service_instance_relation_client_cpm` into one index, let's call it `cpm-*`, the impact on query performance is very limited, because the size of the data set is still ~30GB (60GB with one replica).
   To separate these metrics for SkyWalking queries, we need to add the metric name to the `ROWID` and also store it as a separate column, which is not a big issue either. The only side effect is that the `ROWID` grows by 10-20 characters.
   Also, consider that this grouping mechanism is 100% automatic, so users would feel confused.
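   
   A minimal sketch of the `ROWID` idea above, assuming the metric name is simply prefixed to the existing ID with an underscore. The class `MergedRowId`, the separator and the sample ID are hypothetical and only illustrate the shape of the change; they are not existing SkyWalking code.
   
   ```java
   import java.util.Objects;

   // Hypothetical helper, not part of SkyWalking: builds the row ID used
   // inside a merged index such as cpm-*.
   final class MergedRowId {
       // Assumed separator between the metric name and the original ID.
       private static final String SEPARATOR = "_";

       private MergedRowId() {
       }

       // Prefix the original row ID with the metric name so that rows of
       // different metrics never collide inside the shared index.
       static String of(String metricName, String originalId) {
           Objects.requireNonNull(metricName, "metricName");
           Objects.requireNonNull(originalId, "originalId");
           return metricName + SEPARATOR + originalId;
       }

       // Number of extra characters the prefix adds to every row ID.
       static int overhead(String metricName) {
           return metricName.length() + SEPARATOR.length();
       }

       public static void main(String[] args) {
           // "sample-id" is a made-up placeholder for an existing row ID.
           System.out.println(of("service_cpm", "sample-id"));
           System.out.println(overhead("service_cpm"));
       }
   }
   ```
   
   With this scheme the overhead for `service_cpm` is 12 characters. The same metric name would also be written to its own column, so queries can filter on it without parsing the ID.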
   
   Finally, from the ElasticSearch storage perspective, because these metrics share the same metric function, most columns are still the same. But since we need `StorageEsInstaller#createMapping`, we need the union of all possible columns of all metrics of one kind. In the previous case, endpoint_cpm has one or two more columns.
   As a result, we need to change `createTable(Model model)` to `createTable(List<Model> models)`, to give the storage implementation a chance to merge the logical models before creating the table.
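   
   The column union could be sketched as below, under the assumption that each logical model can be reduced to a map of column name to column type. `MergedMapping`, `unionColumns` and the column names in `main` are hypothetical; the real `Model` class and the real mapping logic are not shown here.
   
   ```java
   import java.util.Arrays;
   import java.util.LinkedHashMap;
   import java.util.List;
   import java.util.Map;

   // Hypothetical sketch, not SkyWalking code: computes the union of columns
   // for several logical models that would share one physical index.
   final class MergedMapping {
       private MergedMapping() {
       }

       // Each model is represented as column name -> column type.
       static Map<String, String> unionColumns(List<Map<String, String>> models) {
           Map<String, String> merged = new LinkedHashMap<>();
           for (Map<String, String> columns : models) {
               for (Map.Entry<String, String> column : columns.entrySet()) {
                   String previous = merged.putIfAbsent(column.getKey(), column.getValue());
                   // A shared column must have the same type in every model.
                   if (previous != null && !previous.equals(column.getValue())) {
                       throw new IllegalArgumentException(
                           "Conflicting type for column " + column.getKey());
                   }
               }
           }
           return merged;
       }

       public static void main(String[] args) {
           // Column names and types below are made up for illustration.
           Map<String, String> serviceCpm = new LinkedHashMap<>();
           serviceCpm.put("entity_id", "keyword");
           serviceCpm.put("value", "long");

           Map<String, String> endpointCpm = new LinkedHashMap<>(serviceCpm);
           endpointCpm.put("service_id", "keyword"); // the extra column

           System.out.println(unionColumns(Arrays.asList(serviceCpm, endpointCpm)));
       }
   }
   ```
   
   Failing fast on a type conflict is one possible design choice: a column that means different things in two metrics cannot safely share one field in the merged index.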
   
   @apache/skywalking-committers I would like to hear your feedback. I talked with @EvanLjp privately, and think this is worth trying.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [skywalking] wu-sheng closed issue #6411: [Core Feature] Merge metrics indexes of ElasticSearch based on MetricsFunction/MeterFunction annotation

Posted by GitBox <gi...@apache.org>.
wu-sheng closed issue #6411:
URL: https://github.com/apache/skywalking/issues/6411


   

