You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2020/02/15 13:35:05 UTC
[GitHub] [skywalking] wu-sheng opened a new pull request #4364: Support
Downsampling Data Packing feature in ES storage implementation
wu-sheng opened a new pull request #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364
### Downsampling Data Packing
Downsampling data packing(`storage/elasticsearch/enablePackedDownsampling`, default activated) is a new feature since 7.0.0.
Metrics data has 4 different precisions, based on `core/default/downsampling` configurations.
In previous(6.x), every precision of each metrics had one separated
index. After this is activated, metrics of day and hour precisions are merged into minute precision. The number of indexes
decreased, and cause less payload to the ElasticSearch server.
### The new data format preview
![image](https://user-images.githubusercontent.com/5441976/74588754-425df300-503a-11ea-937c-b205fcc734a7.png)
After enabling this feature, 3 precisions of metrics data are all existing in one index(minute precision).
### Why query could keep in the same.
From my understanding, once query could use the right name(merged index name), as they are using different timestamp format(because of precision, length is different), they will exclude other precision data automatically. So no change is required.
____
To all @apache/skywalking-committers
This PR has a small(18 files only) change set, but influenced scope is large. And this doesn't have the side-effect, so I active it by default. Please review and give me some feedback.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] aderm commented on issue #4364: Support Downsampling
Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
aderm commented on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586668437
just now get flower mean, the data only increased by 1.7%, which can basically be ignored.So the question I worry about doesn't exist.
> The size of dataset is the root reason of no side effect conclusion. In previous 6.x, minute/hour/day precision data are all organized day by day. So, in one day, the sizes of dataset are 60 * 24(minute), 24(hour) and 1(day) for a single entity(service, instance, endpoint or relationship). We merge them into one, then the increasement of one index is (24 + 1)/(60*24) = 1.7%, but reduce 2 indexes for one metrics. This is not a tradeoff, it is a pure enhancement
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] wu-sheng commented on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586662020
Hi @aderm What are your questions indicating? I am a little confused about reading this.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] aderm commented on issue #4364: Support Downsampling
Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
aderm commented on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586660956
> About `writing to/querying from the merged index`, we always query/write from metrics in minute precision, if there is no performance issue there, there is no issue for merged index. Right?
Regarding the query of the index, if 3 precisions of metrics are merged into the minute index, have two small problems:
1. When the ES server loads index memory data, because lucene index storage is similar to columnar storage, the merge will take up more memory. Will timebucket or other field queries consume more resources? evaluate the memory resources at the minute level? Although the result set of the search is the same size;
2. When the SW daily data volume is large, reaching a single index to the level of 10 million or 100 million, will the performance not change if the index conditions are not changed? Maybe more scenarios are minute-level queries.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] wu-sheng commented on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586596886
### Explanation of No Side Effect
The size of dataset is the root reason of no side effect conclusion. In previous 6.x, minute/hour/day precision data are all organized day by day. So, in one day, the sizes of dataset are 60 * 24(minute), 24(hour) and 1(day) for a single entity(service, instance, endpoint or relationship). We merge them into one, then the increasement of one index is (24 + 1)/(60*24) = 1.7%, but reduce 2 indexes for one metrics. This is not a tradeoff, it is a pure enhancement.
### Added bonus
Because of the performance enhancement, the e2e tests will be more stable due to less indexes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] wu-sheng commented on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586600057
> Does this discuss elsewhere(I have impressions)? What about the performance of writing to/querying from the merged index? Because the amount of the data in the index grows, and I’m wondering whether it’s “pure enhancement” or not
No, there isn't. I got reports from others, there is performance concern about the index number. I did the math, look at the previous comment, https://github.com/apache/skywalking/pull/4364#issuecomment-586596886.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] aderm edited a comment on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
aderm edited a comment on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586660956
> About `writing to/querying from the merged index`, we always query/write from metrics in minute precision, if there is no performance issue there, there is no issue for merged index. Right?
Regarding the query of the index, if 3 precisions of metrics are merged into the minute index, have two small problems:
1. When the ES server loads index memory data, because lucene index storage is similar to columnar storage, the merge will take up more memory. Will timebucket or other field queries consume more resources? evaluate the memory resources at the minute level? Although the result set of the search is the same size;
2. When the SW daily data volume is large, reaching a single index to the level of 10 million or 100 million, will the performance not change if the index conditions are not changed?as before said, more scenarios are minute-level queries.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] wu-sheng merged pull request #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
wu-sheng merged pull request #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] codecov-io edited a comment on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586596068
# [Codecov](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=h1) Report
> Merging [#4364](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=desc) into [master](https://codecov.io/gh/apache/skywalking/commit/5fed153ee2c51584a855e474ea97e97c6b8ec9ca?src=pr&el=desc) will **decrease** coverage by `0.3%`.
> The diff coverage is `0%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/skywalking/pull/4364/graphs/tree.svg?width=650&token=qrILxY5yA8&height=150&src=pr)](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #4364 +/- ##
==========================================
- Coverage 26.39% 26.09% -0.31%
==========================================
Files 1182 1182
Lines 26922 26975 +53
Branches 3714 3719 +5
==========================================
- Hits 7107 7040 -67
- Misses 19182 19301 +119
- Partials 633 634 +1
```
| [Impacted Files](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...in/elasticsearch7/client/ElasticSearch7Client.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-b2FwLXNlcnZlci9zZXJ2ZXItc3RvcmFnZS1wbHVnaW4vc3RvcmFnZS1lbGFzdGljc2VhcmNoNy1wbHVnaW4vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NreXdhbGtpbmcvb2FwL3NlcnZlci9zdG9yYWdlL3BsdWdpbi9lbGFzdGljc2VhcmNoNy9jbGllbnQvRWxhc3RpY1NlYXJjaDdDbGllbnQuamF2YQ==) | `0% <0%> (ø)` | :arrow_up: |
| [...sticsearch/StorageModuleElasticsearchProvider.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-b2FwLXNlcnZlci9zZXJ2ZXItc3RvcmFnZS1wbHVnaW4vc3RvcmFnZS1lbGFzdGljc2VhcmNoLXBsdWdpbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2t5d2Fsa2luZy9vYXAvc2VydmVyL3N0b3JhZ2UvcGx1Z2luL2VsYXN0aWNzZWFyY2gvU3RvcmFnZU1vZHVsZUVsYXN0aWNzZWFyY2hQcm92aWRlci5qYXZh) | `0% <0%> (ø)` | :arrow_up: |
| [...rary/client/elasticsearch/ElasticSearchClient.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-b2FwLXNlcnZlci9zZXJ2ZXItbGlicmFyeS9saWJyYXJ5LWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2t5d2Fsa2luZy9vYXAvc2VydmVyL2xpYnJhcnkvY2xpZW50L2VsYXN0aWNzZWFyY2gvRWxhc3RpY1NlYXJjaENsaWVudC5qYXZh) | `0% <0%> (ø)` | :arrow_up: |
| [...icsearch7/StorageModuleElasticsearch7Provider.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-b2FwLXNlcnZlci9zZXJ2ZXItc3RvcmFnZS1wbHVnaW4vc3RvcmFnZS1lbGFzdGljc2VhcmNoNy1wbHVnaW4vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NreXdhbGtpbmcvb2FwL3NlcnZlci9zdG9yYWdlL3BsdWdpbi9lbGFzdGljc2VhcmNoNy9TdG9yYWdlTW9kdWxlRWxhc3RpY3NlYXJjaDdQcm92aWRlci5qYXZh) | `0% <0%> (ø)` | :arrow_up: |
| [...lasticsearch/StorageModuleElasticsearchConfig.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-b2FwLXNlcnZlci9zZXJ2ZXItc3RvcmFnZS1wbHVnaW4vc3RvcmFnZS1lbGFzdGljc2VhcmNoLXBsdWdpbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2t5d2Fsa2luZy9vYXAvc2VydmVyL3N0b3JhZ2UvcGx1Z2luL2VsYXN0aWNzZWFyY2gvU3RvcmFnZU1vZHVsZUVsYXN0aWNzZWFyY2hDb25maWcuamF2YQ==) | `0% <0%> (ø)` | :arrow_up: |
| [...apm/agent/core/remote/GRPCStreamServiceStatus.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-YXBtLXNuaWZmZXIvYXBtLWFnZW50LWNvcmUvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NreXdhbGtpbmcvYXBtL2FnZW50L2NvcmUvcmVtb3RlL0dSUENTdHJlYW1TZXJ2aWNlU3RhdHVzLmphdmE=) | `41.17% <0%> (-41.18%)` | :arrow_down: |
| [...ache/skywalking/apm/agent/core/jvm/JVMService.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-YXBtLXNuaWZmZXIvYXBtLWFnZW50LWNvcmUvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NreXdhbGtpbmcvYXBtL2FnZW50L2NvcmUvanZtL0pWTVNlcnZpY2UuamF2YQ==) | `60% <0%> (-20%)` | :arrow_down: |
| [...alking/apm/agent/core/remote/AgentIDDecorator.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-YXBtLXNuaWZmZXIvYXBtLWFnZW50LWNvcmUvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NreXdhbGtpbmcvYXBtL2FnZW50L2NvcmUvcmVtb3RlL0FnZW50SUREZWNvcmF0b3IuamF2YQ==) | `66.66% <0%> (-18.52%)` | :arrow_down: |
| [.../agent/core/profile/ProfileTaskChannelService.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-YXBtLXNuaWZmZXIvYXBtLWFnZW50LWNvcmUvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NreXdhbGtpbmcvYXBtL2FnZW50L2NvcmUvcHJvZmlsZS9Qcm9maWxlVGFza0NoYW5uZWxTZXJ2aWNlLmphdmE=) | `27.95% <0%> (-18.28%)` | :arrow_down: |
| [.../core/remote/ServiceAndEndpointRegisterClient.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-YXBtLXNuaWZmZXIvYXBtLWFnZW50LWNvcmUvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NreXdhbGtpbmcvYXBtL2FnZW50L2NvcmUvcmVtb3RlL1NlcnZpY2VBbmRFbmRwb2ludFJlZ2lzdGVyQ2xpZW50LmphdmE=) | `30.61% <0%> (-15.31%)` | :arrow_down: |
| ... and [5 more](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=footer). Last update [5fed153...9dcfd97](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] wu-sheng edited a comment on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
wu-sheng edited a comment on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586600269
About `writing to/querying from the merged index`, we always query/write from metrics in minute precision, if there is no performance issue there, there is no issue for merged index. Right?
The key point of this, is the sizes of different precision datasets are huge, so separating them is meaningless. If one index have too much data, increasing shards num is a better idea. Agree?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] aderm edited a comment on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
aderm edited a comment on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586660956
> About `writing to/querying from the merged index`, we always query/write from metrics in minute precision, if there is no performance issue there, there is no issue for merged index. Right?
Regarding the query of the index, if 3 precisions of metrics are merged into the minute index, have two little questions:
1. When the ES server loads index memory data, because lucene index storage is similar to columnar storage, the merge will take up more memory and cpu. Will timebucket or other field queries consume more resources? evaluate the memory resources at the minute level? Although the result set of the search is the same size;
2. When the SW daily data volume is large, reaching a single index to the level of 10 million or 100 million, will the performance not change if the index conditions are not changed?as before said, more scenarios are minute-level queries.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] codecov-io commented on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
codecov-io commented on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586596068
# [Codecov](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=h1) Report
> Merging [#4364](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=desc) into [master](https://codecov.io/gh/apache/skywalking/commit/5fed153ee2c51584a855e474ea97e97c6b8ec9ca?src=pr&el=desc) will **decrease** coverage by `0.05%`.
> The diff coverage is `0%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/skywalking/pull/4364/graphs/tree.svg?width=650&token=qrILxY5yA8&height=150&src=pr)](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #4364 +/- ##
==========================================
- Coverage 26.14% 26.09% -0.06%
==========================================
Files 1182 1182
Lines 26922 26976 +54
Branches 3714 3719 +5
==========================================
Hits 7040 7040
- Misses 19248 19302 +54
Partials 634 634
```
| [Impacted Files](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...in/elasticsearch7/client/ElasticSearch7Client.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-b2FwLXNlcnZlci9zZXJ2ZXItc3RvcmFnZS1wbHVnaW4vc3RvcmFnZS1lbGFzdGljc2VhcmNoNy1wbHVnaW4vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NreXdhbGtpbmcvb2FwL3NlcnZlci9zdG9yYWdlL3BsdWdpbi9lbGFzdGljc2VhcmNoNy9jbGllbnQvRWxhc3RpY1NlYXJjaDdDbGllbnQuamF2YQ==) | `0% <0%> (ø)` | :arrow_up: |
| [...sticsearch/StorageModuleElasticsearchProvider.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-b2FwLXNlcnZlci9zZXJ2ZXItc3RvcmFnZS1wbHVnaW4vc3RvcmFnZS1lbGFzdGljc2VhcmNoLXBsdWdpbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2t5d2Fsa2luZy9vYXAvc2VydmVyL3N0b3JhZ2UvcGx1Z2luL2VsYXN0aWNzZWFyY2gvU3RvcmFnZU1vZHVsZUVsYXN0aWNzZWFyY2hQcm92aWRlci5qYXZh) | `0% <0%> (ø)` | :arrow_up: |
| [...rary/client/elasticsearch/ElasticSearchClient.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-b2FwLXNlcnZlci9zZXJ2ZXItbGlicmFyeS9saWJyYXJ5LWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2t5d2Fsa2luZy9vYXAvc2VydmVyL2xpYnJhcnkvY2xpZW50L2VsYXN0aWNzZWFyY2gvRWxhc3RpY1NlYXJjaENsaWVudC5qYXZh) | `0% <0%> (ø)` | :arrow_up: |
| [...icsearch7/StorageModuleElasticsearch7Provider.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-b2FwLXNlcnZlci9zZXJ2ZXItc3RvcmFnZS1wbHVnaW4vc3RvcmFnZS1lbGFzdGljc2VhcmNoNy1wbHVnaW4vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3NreXdhbGtpbmcvb2FwL3NlcnZlci9zdG9yYWdlL3BsdWdpbi9lbGFzdGljc2VhcmNoNy9TdG9yYWdlTW9kdWxlRWxhc3RpY3NlYXJjaDdQcm92aWRlci5qYXZh) | `0% <0%> (ø)` | :arrow_up: |
| [...lasticsearch/StorageModuleElasticsearchConfig.java](https://codecov.io/gh/apache/skywalking/pull/4364/diff?src=pr&el=tree#diff-b2FwLXNlcnZlci9zZXJ2ZXItc3RvcmFnZS1wbHVnaW4vc3RvcmFnZS1lbGFzdGljc2VhcmNoLXBsdWdpbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvc2t5d2Fsa2luZy9vYXAvc2VydmVyL3N0b3JhZ2UvcGx1Z2luL2VsYXN0aWNzZWFyY2gvU3RvcmFnZU1vZHVsZUVsYXN0aWNzZWFyY2hDb25maWcuamF2YQ==) | `0% <0%> (ø)` | :arrow_up: |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=footer). Last update [5fed153...10bcc87](https://codecov.io/gh/apache/skywalking/pull/4364?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] hanahmily commented on a change in pull request #4364:
Support Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
hanahmily commented on a change in pull request #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#discussion_r379965111
##########
File path: oap-server/server-bootstrap/src/main/resources/application.yml
##########
@@ -104,6 +105,7 @@ storage:
protocol: ${SW_STORAGE_ES_HTTP_PROTOCOL:"http"}
#trustStorePath: ${SW_SW_STORAGE_ES_SSL_JKS_PATH:"../es_keystore.jks"}
#trustStorePass: ${SW_SW_STORAGE_ES_SSL_JKS_PASS:""}
+ enablePackedDownsampling: ${SW_STORAGE_ENABLE_PACKED_DOWNSAMPLING:true} # Hour and Day metrics will be merged into minute index.
user: ${SW_ES_USER:""}
password: ${SW_ES_PASSWORD:""}
indexShardsNumber: ${SW_STORAGE_ES_INDEX_SHARDS_NUMBER:2}
Review comment:
Do we need to increase it when packed downsampling enabled?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] kezhenxu94 commented on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
kezhenxu94 commented on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586599712
Does this discuss elsewhere(I have impressions)? What about the performance of writing to/querying from the merged index? Because the amount of the data in the index grows, and I’m wondering whether it’s “pure enhancement” or not
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] wu-sheng commented on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586600269
About `writing to/querying from the merged index`, we always query/write from metrics in minute precision, if there is no performance issue there, there is no issue for merged index. Right?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] aderm edited a comment on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
aderm edited a comment on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586660956
> About `writing to/querying from the merged index`, we always query/write from metrics in minute precision, if there is no performance issue there, there is no issue for merged index. Right?
Regarding the query of the index, if 3 precisions of metrics are merged into the minute index, have two small problems:
1. When the ES server loads index memory data, because lucene index storage is similar to columnar storage, the merge will take up more memory and cpu. Will timebucket or other field queries consume more resources? evaluate the memory resources at the minute level? Although the result set of the search is the same size;
2. When the SW daily data volume is large, reaching a single index to the level of 10 million or 100 million, will the performance not change if the index conditions are not changed?as before said, more scenarios are minute-level queries.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] kezhenxu94 edited a comment on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
kezhenxu94 edited a comment on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586599712
Did we discuss this elsewhere(I have impressions)? What about the performance of writing to/querying from the merged index? Because the amount of the data in the index grows, and I’m wondering whether it’s “pure enhancement” or not
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] wu-sheng commented on a change in pull request #4364:
Support Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on a change in pull request #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#discussion_r379835434
##########
File path: oap-server/server-storage-plugin/storage-elasticsearch-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/elasticsearch/StorageModuleElasticsearchProvider.java
##########
@@ -174,4 +197,63 @@ private void overrideCoreModuleTTLConfig() {
configService.getDataTTLConfig().setDayMetricsDataTTL(config.getDayMetricsDataTTL());
configService.getDataTTLConfig().setMonthMetricsDataTTL(config.getMonthMetricsDataTTL());
}
+
+ public static List<IndexNameConverter> indexNameConverters(String namespace, boolean enablePackedDownsampling) {
+ List<IndexNameConverter> converters = new ArrayList<>();
+
+ if (enablePackedDownsampling) {
+ // Packed downsampling converter.
+ converters.add(new PackedDownsamplingConverter(enablePackedDownsampling));
Review comment:
Fixed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] kezhenxu94 commented on a change in pull request
#4364: Support Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
kezhenxu94 commented on a change in pull request #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#discussion_r379834652
##########
File path: oap-server/server-storage-plugin/storage-elasticsearch-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/elasticsearch/StorageModuleElasticsearchProvider.java
##########
@@ -174,4 +197,63 @@ private void overrideCoreModuleTTLConfig() {
configService.getDataTTLConfig().setDayMetricsDataTTL(config.getDayMetricsDataTTL());
configService.getDataTTLConfig().setMonthMetricsDataTTL(config.getMonthMetricsDataTTL());
}
+
+ public static List<IndexNameConverter> indexNameConverters(String namespace, boolean enablePackedDownsampling) {
+ List<IndexNameConverter> converters = new ArrayList<>();
+
+ if (enablePackedDownsampling) {
+ // Packed downsampling converter.
+ converters.add(new PackedDownsamplingConverter(enablePackedDownsampling));
Review comment:
nit: `enablePackedDownsampling` is always `true`, and it's never used in class `PackedDownsamplingConverter `
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [skywalking] aderm edited a comment on issue #4364: Support
Downsampling Data Packing feature in ES storage implementation
Posted by GitBox <gi...@apache.org>.
aderm edited a comment on issue #4364: Support Downsampling Data Packing feature in ES storage implementation
URL: https://github.com/apache/skywalking/pull/4364#issuecomment-586668437
ops, just now get flower mean, the data only increased by 1.7%, which can basically be ignored.So the question I worry about doesn't exist.
> The size of dataset is the root reason of no side effect conclusion. In previous 6.x, minute/hour/day precision data are all organized day by day. So, in one day, the sizes of dataset are 60 * 24(minute), 24(hour) and 1(day) for a single entity(service, instance, endpoint or relationship). We merge them into one, then the increasement of one index is (24 + 1)/(60*24) = 1.7%, but reduce 2 indexes for one metrics. This is not a tradeoff, it is a pure enhancement
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services