You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@skywalking.apache.org by "lujiajing1126 (via GitHub)" <gi...@apache.org> on 2023/02/25 14:54:33 UTC

[GitHub] [skywalking] lujiajing1126 opened a new pull request, #10448: Integrate BanyanDB server-side TopN

lujiajing1126 opened a new pull request, #10448:
URL: https://github.com/apache/skywalking/pull/10448

   - [x] If this pull request closes/resolves/fixes an existing issue, replace the issue number. Partially resolve #9864.
   - [ ] Update the [`CHANGES` log](https://github.com/apache/skywalking/blob/master/docs/en/changes/changes.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118928239


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   I mean there are two kinds of topN.
   1. Where **service_id = xxx** and time_range in [start, end] Group by entityID
   2. Where and time_range in [start, end] Group by entityID
   
   Check the query condition protocol. 
   https://github.com/apache/skywalking-query-protocol/blob/450328c6445d4dc50a868a0b797a14755f1d82cc/metrics-v2.graphqls#L74-L75



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#issuecomment-1445375692

   I've tested some cases. It worked,
   
   endpoint_cpm
   
   <img width="651" alt="image" src="https://user-images.githubusercontent.com/2568208/221416722-8eca761f-3620-4de8-8ed6-b32676827061.png">
   
   service_cpm
   
   <img width="637" alt="image" src="https://user-images.githubusercontent.com/2568208/221416737-bb3866ea-a5c3-44e9-bf3c-6982da421725.png">
   
   Now we have to decide how we can define the parameters? May be in the annotation?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#issuecomment-1445378784

   And another issue is to support additional filter in the query. The only property is `serviceID` now?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118953797


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }
+
+        if (resp.size() == 0) {
+            return Collections.emptyList();
+        } else if (resp.size() > 1) { // since we have done aggregation, i.e. MEAN

Review Comment:
   For a time range query, we will get a series of time windows. For example, when we are querying for the last 30mins, we may get 30 windows provided that the interval of the metric is 1min.
   
   And each time window (1 min) contains a fully ordered list, the maximum number of entries is controlled by the `countersNum` given in the TopNAggregation schema.
   
   So the (final) MEAN aggreation is imposed on all 30 windows, and gives the average value of the same key (groupByTagValues). After all these operations, we get a single result set ranked by these mean values.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118877729


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/MetadataRegistry.java:
##########
@@ -173,24 +172,31 @@ public Measure registerMeasureModel(Model model, BanyanDBStorageConfig config, C
             schemaBuilder.field(field.getName());
         }
         // parse TopN
-        schemaBuilder.topNSpec(parseTopNSpec(model, schemaMetadata.name()));
+        schemaBuilder.topNSpec(parseTopNSpec(model, schemaMetadata.name(), tagsAndFields));
 
         registry.put(schemaMetadata.name(), schemaBuilder.build());
         return builder.build();
     }
 
-    private TopNSpec parseTopNSpec(final Model model, final String measureName) {
-        final String valueCName = ValueColumnMetadata.INSTANCE.getValueCName(model.getName());
-        if (StringUtil.isEmpty(valueCName)) {
+    private TopNSpec parseTopNSpec(final Model model, final String measureName, MeasureMetadata tagsAndFields) {
+        final Optional<ValueColumnMetadata.ValueColumn> valueColumnOpt = ValueColumnMetadata.INSTANCE.readValueColumnDefinition(model.getName());
+        if (valueColumnOpt.isEmpty()) {
             return null;
         }
-        // TODO: how to configure parameters?
+
+        List<String> groupByTagNames = new ArrayList<>();
+        groupByTagNames.add(Metrics.ENTITY_ID);
+        for (final TagMetadata tagSpec : tagsAndFields.tags) {
+            if (tagSpec.getTagSpec().getTagName().equals(InstanceTraffic.SERVICE_ID)) {
+                groupByTagNames.add(InstanceTraffic.SERVICE_ID);

Review Comment:
   > All single value metrics, I think. We don't support labeled value metrics for topN query.
   > 
   > That is why the manual setup is better than this hidden logic.
   > 
   > BTW, the top-N query support includes two cases, w/ or w/o parent service ID. Usually instance Top-N should have service ID but it is not required from protocol level. Also, service metric has no parent for topN
   
   I've added `@TopNAggregation` for
   
   - OAL: dynamically annotate `@TopNAggregation` for single-valued metrics,
   - MAL: only annotate `@MeterFunction`s that generate single value
   
   I'm not familiar with the metrics/meter system. I am not sure if this is correct.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1175921850


##########
oap-server-bom/pom.xml:
##########
@@ -72,7 +72,7 @@
         <awaitility.version>3.0.0</awaitility.version>
         <httpcore.version>4.4.13</httpcore.version>
         <commons-compress.version>1.21</commons-compress.version>
-        <banyandb-java-client.version>0.4.0-SNAPSHOT</banyandb-java-client.version>
+        <banyandb-java-client.version>0.4.0-rc0</banyandb-java-client.version>

Review Comment:
   > Is the rc0 already out on maven central?
   
   Yes. I've released last night.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#issuecomment-1510204522

   Based on a discussion with @hanahmily, please open this for `Endpoint` and `BrowserAppPagePerf` sources only. All other sources and MAL functions should skip this as they don't have so many candidates for our use cases.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1167562281


##########
oap-server/oal-rt/src/main/java/org/apache/skywalking/oal/rt/parser/SourceColumn.java:
##########
@@ -34,8 +34,10 @@ public class SourceColumn {
     private int length;
     private String fieldSetter;
     private String fieldGetter;
+    private final boolean groupByColInTopN;
 
-    public SourceColumn(String fieldName, String columnName, Class<?> type, boolean isID, int length) {
+    public SourceColumn(String fieldName, String columnName, Class<?> type, boolean isID, int length,
+                        boolean groupByColInTopN) {

Review Comment:
   Is this `Col` meaning `Column`? I think this is duplicated as this class is column. My proposal was `Cond` for `Condition`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118913759


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   I think w/ and w/o service ID mean different pre-ordered at BanyanDB side for pre-calculation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118877729


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/MetadataRegistry.java:
##########
@@ -173,24 +172,31 @@ public Measure registerMeasureModel(Model model, BanyanDBStorageConfig config, C
             schemaBuilder.field(field.getName());
         }
         // parse TopN
-        schemaBuilder.topNSpec(parseTopNSpec(model, schemaMetadata.name()));
+        schemaBuilder.topNSpec(parseTopNSpec(model, schemaMetadata.name(), tagsAndFields));
 
         registry.put(schemaMetadata.name(), schemaBuilder.build());
         return builder.build();
     }
 
-    private TopNSpec parseTopNSpec(final Model model, final String measureName) {
-        final String valueCName = ValueColumnMetadata.INSTANCE.getValueCName(model.getName());
-        if (StringUtil.isEmpty(valueCName)) {
+    private TopNSpec parseTopNSpec(final Model model, final String measureName, MeasureMetadata tagsAndFields) {
+        final Optional<ValueColumnMetadata.ValueColumn> valueColumnOpt = ValueColumnMetadata.INSTANCE.readValueColumnDefinition(model.getName());
+        if (valueColumnOpt.isEmpty()) {
             return null;
         }
-        // TODO: how to configure parameters?
+
+        List<String> groupByTagNames = new ArrayList<>();
+        groupByTagNames.add(Metrics.ENTITY_ID);
+        for (final TagMetadata tagSpec : tagsAndFields.tags) {
+            if (tagSpec.getTagSpec().getTagName().equals(InstanceTraffic.SERVICE_ID)) {
+                groupByTagNames.add(InstanceTraffic.SERVICE_ID);

Review Comment:
   > All single value metrics, I think. We don't support labeled value metrics for topN query.
   > 
   > That is why the manual setup is better than this hidden logic.
   > 
   > BTW, the top-N query support includes two cases, w/ or w/o parent service ID. Usually instance Top-N should have service ID but it is not required from protocol level. Also, service metric has no parent for topN
   
   I've added `@TopNAggregation` for
   
   - OAL: dynamically annotate `@TopNAggregation` for single-valued metrics,
   - MAL: only annotate `@MeterFunction`s that generate single value
   
   I'm not familiar with the metrics/meter system. I am not sure if this is correct. It works locally



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] hanahmily commented on pull request #10448: Integrate BanyanDB server-side TopN

Posted by "hanahmily (via GitHub)" <gi...@apache.org>.
hanahmily commented on PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#issuecomment-1475247614

   @lujiajing1126 Would you pls move forward since https://github.com/apache/skywalking-banyandb/pull/257 is merged?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1174687593


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -64,6 +63,13 @@ public List<SelectedRecord> sortMetrics(TopNCondition condition, String valueCol
 
         // BanyanDB server-side TopN support for metrics pre-aggregation.
         if (schema.getTopNSpec() != null) {
+            // check if additional conditions are registered
+            if (CollectionUtils.isNotEmpty(additionalConditions)) {
+                final Set<String> additionalConditionNames = additionalConditions.stream().map(KeyValue::getKey).collect(Collectors.toSet());
+                if (!ImmutableSet.copyOf(schema.getTopNSpec().getGroupByTagNames()).containsAll(additionalConditionNames)) {
+                    throw new IOException("[" + Strings.join(additionalConditionNames, ',') + "] conditions are not registered as groupBy tags");

Review Comment:
   Two questions,
   1. I think group-by pre-aggregation should be `equal` or `no condition`, rather than `contains`? I know that we have one condition, so, contains means exactly the same. But I am trying to be precise. 
   2. When this pre-aggregation doesn't fit the current query, I think failing back to on-demand group-by still works. We don't have to through exception, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118955396


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   > Let's say there are 200 services, each service has 50k endpoints, we need both service-based topN endpoints and global level topN endpoints. How do we make this work? If we only set the rule as top 1000, which are all belonging to several services. This means we can't have topN for other services.
   > 
   > @lujiajing1126 @hanahmily What are your suggestions for this real case?
   
   Yes. It is possible. Even two TopN(s) cannot help in this case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118877729


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/MetadataRegistry.java:
##########
@@ -173,24 +172,31 @@ public Measure registerMeasureModel(Model model, BanyanDBStorageConfig config, C
             schemaBuilder.field(field.getName());
         }
         // parse TopN
-        schemaBuilder.topNSpec(parseTopNSpec(model, schemaMetadata.name()));
+        schemaBuilder.topNSpec(parseTopNSpec(model, schemaMetadata.name(), tagsAndFields));
 
         registry.put(schemaMetadata.name(), schemaBuilder.build());
         return builder.build();
     }
 
-    private TopNSpec parseTopNSpec(final Model model, final String measureName) {
-        final String valueCName = ValueColumnMetadata.INSTANCE.getValueCName(model.getName());
-        if (StringUtil.isEmpty(valueCName)) {
+    private TopNSpec parseTopNSpec(final Model model, final String measureName, MeasureMetadata tagsAndFields) {
+        final Optional<ValueColumnMetadata.ValueColumn> valueColumnOpt = ValueColumnMetadata.INSTANCE.readValueColumnDefinition(model.getName());
+        if (valueColumnOpt.isEmpty()) {
             return null;
         }
-        // TODO: how to configure parameters?
+
+        List<String> groupByTagNames = new ArrayList<>();
+        groupByTagNames.add(Metrics.ENTITY_ID);
+        for (final TagMetadata tagSpec : tagsAndFields.tags) {
+            if (tagSpec.getTagSpec().getTagName().equals(InstanceTraffic.SERVICE_ID)) {
+                groupByTagNames.add(InstanceTraffic.SERVICE_ID);

Review Comment:
   > All single value metrics, I think. We don't support labeled value metrics for topN query.
   > 
   > That is why the manual setup is better than this hidden logic.
   > 
   > BTW, the top-N query support includes two cases, w/ or w/o parent service ID. Usually instance Top-N should have service ID but it is not required from protocol level. Also, service metric has no parent for topN
   
   I've added `@TopNAggregation` for
   
   - OAL: dynamically annotate `@TopNAggregation` for single-valued metrics,
   - MAL: only annotate `@MeterFunction`s that generate single value



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118104454


##########
oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/model/BanyanDBModelExtension.java:
##########
@@ -45,4 +46,23 @@ public class BanyanDBModelExtension {
     @Setter
     private boolean storeIDTag;
 
+    /**
+     * lru_size defines how many snapshots are allowed to be maintained in the memory.
+     * The default value is 2 in the BanyanDB if not set.
+     *
+     * @since 9.4.0
+     */
+    @Getter
+    @Setter
+    private int lruSize = 2;

Review Comment:
   Could you explain a little more about lru_size? How does this impact?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118731890


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/MetadataRegistry.java:
##########
@@ -173,24 +172,31 @@ public Measure registerMeasureModel(Model model, BanyanDBStorageConfig config, C
             schemaBuilder.field(field.getName());
         }
         // parse TopN
-        schemaBuilder.topNSpec(parseTopNSpec(model, schemaMetadata.name()));
+        schemaBuilder.topNSpec(parseTopNSpec(model, schemaMetadata.name(), tagsAndFields));
 
         registry.put(schemaMetadata.name(), schemaBuilder.build());
         return builder.build();
     }
 
-    private TopNSpec parseTopNSpec(final Model model, final String measureName) {
-        final String valueCName = ValueColumnMetadata.INSTANCE.getValueCName(model.getName());
-        if (StringUtil.isEmpty(valueCName)) {
+    private TopNSpec parseTopNSpec(final Model model, final String measureName, MeasureMetadata tagsAndFields) {
+        final Optional<ValueColumnMetadata.ValueColumn> valueColumnOpt = ValueColumnMetadata.INSTANCE.readValueColumnDefinition(model.getName());
+        if (valueColumnOpt.isEmpty()) {
             return null;
         }
-        // TODO: how to configure parameters?
+
+        List<String> groupByTagNames = new ArrayList<>();
+        groupByTagNames.add(Metrics.ENTITY_ID);
+        for (final TagMetadata tagSpec : tagsAndFields.tags) {
+            if (tagSpec.getTagSpec().getTagName().equals(InstanceTraffic.SERVICE_ID)) {
+                groupByTagNames.add(InstanceTraffic.SERVICE_ID);

Review Comment:
   If we add an array to `TopNAggregation`, do we need to annotate all Metrics that should have TopN support?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118955396


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   > Let's say there are 200 services, each service has 50k endpoints, we need both service-based topN endpoints and global level topN endpoints. How do we make this work? If we only set the rule as top 1000, which are all belonging to several services. This means we can't have topN for other services.
   > 
   > @lujiajing1126 @hanahmily What are your suggestions for this real case?
   
   Yes. It is possible. Even two TopN(s) or more cannot help in this case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118953641


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   Let's say there are 200 services, each service has 50k endpoints, we need both service-based topN endpoints and global level topN endpoints. How do we make this work?
   If we only set the  rule as top 1000, which are all belonging to several services. This means we can't have topN for other services. 
   
   @lujiajing1126 @hanahmily What are your suggestions for this real case?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] hanahmily commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "hanahmily (via GitHub)" <gi...@apache.org>.
hanahmily commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1119436555


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   > > Let's say there are 200 services, each service has 50k endpoints, we need both service-based topN endpoints and global level topN endpoints. How do we make this work? If we only set the rule as top 1000, which are all belonging to several services. This means we can't have topN for other services.
   > 
   > > 
   > 
   > > @lujiajing1126 @hanahmily What are your suggestions for this real case?
   > 
   > 
   > 
   > Yes. It is possible. Even two TopN(s) or more cannot help in this case.
   
   From the server’s perspective, we can set a global endpoint topn process and several sub-endpoints top with a service id to support this scenario. @lujiajing1126 What’s the gap here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1175343126


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -64,6 +63,13 @@ public List<SelectedRecord> sortMetrics(TopNCondition condition, String valueCol
 
         // BanyanDB server-side TopN support for metrics pre-aggregation.
         if (schema.getTopNSpec() != null) {
+            // check if additional conditions are registered
+            if (CollectionUtils.isNotEmpty(additionalConditions)) {
+                final Set<String> additionalConditionNames = additionalConditions.stream().map(KeyValue::getKey).collect(Collectors.toSet());
+                if (!ImmutableSet.copyOf(schema.getTopNSpec().getGroupByTagNames()).containsAll(additionalConditionNames)) {
+                    throw new IOException("[" + Strings.join(additionalConditionNames, ',') + "] conditions are not registered as groupBy tags");

Review Comment:
   You're right. Fixed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1117938706


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -51,6 +52,43 @@ public BanyanDBAggregationQueryDAO(BanyanDBStorageClient client) {
     public List<SelectedRecord> sortMetrics(TopNCondition condition, String valueColumnName, Duration duration, List<KeyValue> additionalConditions) throws IOException {
         final String modelName = condition.getName();
         final TimestampRange timestampRange = new TimestampRange(duration.getStartTimestamp(), duration.getEndTimestamp());
+        // fast-path: BanyanDB server-side TopN support
+        MetadataRegistry.Schema schema = MetadataRegistry.INSTANCE.findMetadata(modelName, duration.getStep());
+        if (schema == null) {
+            throw new IOException("schema is not registered");
+        }
+
+        MetadataRegistry.ColumnSpec spec = schema.getSpec(valueColumnName);
+        if (spec == null) {
+            throw new IOException("field spec is not registered");
+        }
+
+        if (schema.hasTopNAggregation()) {

Review Comment:
   In which case, there is no server-side pre-calculated aggregation?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118105331


##########
oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/model/BanyanDBModelExtension.java:
##########
@@ -45,4 +46,23 @@ public class BanyanDBModelExtension {
     @Setter
     private boolean storeIDTag;
 
+    /**
+     * lru_size defines how many snapshots are allowed to be maintained in the memory.
+     * The default value is 2 in the BanyanDB if not set.
+     *
+     * @since 9.4.0
+     */
+    @Getter
+    @Setter
+    private int lruSize = 2;

Review Comment:
   We should have a TopN sub-class. Building this inside `BanyanDBModelExtension` is not very clear. Especially, I think all models have BanyanDBModelExtension, but not all of them have TopN definition.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1167565862


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +59,41 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }
+
+        if (resp.size() == 0) {
+            return Collections.emptyList();
+        } else if (resp.size() > 1) { // since we have done aggregation, i.e. MEAN
+            throw new IOException("invalid TopN response");
+        }
+
         final List<SelectedRecord> topNList = new ArrayList<>();
-        for (DataPoint dataPoint : resp.getDataPoints()) {
+        for (TopNQueryResponse.Item item : resp.getTopNLists().get(0).getItems()) {

Review Comment:
   Could you explain why TopNList is a List(Items) nested inside another List? But meanwhile, TopNLists size is always 1.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1172667845


##########
oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/annotation/BanyanDB.java:
##########
@@ -157,10 +161,26 @@ enum IndexType {
 
     /**
      * StoreIDTag indicates a metric store its ID as a tag for searching.
+     *
      * @since 9.4.0
      */
     @Target({ElementType.TYPE})
     @Retention(RetentionPolicy.RUNTIME)
     @interface StoreIDAsTag {
     }
+
+    /**
+     * Generate a TopN Aggregation and use the annotated column as a groupBy tag.
+     * It also contains parameters for TopNAggregation
+     *
+     * @since 9.4.0
+     */
+    @Target({ElementType.FIELD})
+    @Retention(RetentionPolicy.RUNTIME)
+    @Inherited
+    @interface TopNAggregation {
+        int lruSize() default 2;

Review Comment:
   ```suggestion
           /**
            * The size of LRU determines the max tolerant time range.
            * The data in [T - lruSize * n, T] would be accepted in the pre-aggregation process.
            * T = the current time in the current dimensionality.
            * n = interval in the current dimensionality.
            */
           int lruSize() default 2;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1172649849


##########
oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/annotation/BanyanDB.java:
##########
@@ -157,10 +161,26 @@ enum IndexType {
 
     /**
      * StoreIDTag indicates a metric store its ID as a tag for searching.
+     *
      * @since 9.4.0
      */
     @Target({ElementType.TYPE})
     @Retention(RetentionPolicy.RUNTIME)
     @interface StoreIDAsTag {
     }
+
+    /**
+     * Generate a TopN Aggregation and use the annotated column as a groupBy tag.
+     * It also contains parameters for TopNAggregation
+     *
+     * @since 9.4.0
+     */
+    @Target({ElementType.FIELD})
+    @Retention(RetentionPolicy.RUNTIME)
+    @Inherited
+    @interface TopNAggregation {
+        int lruSize() default 2;
+
+        int countersNumber() default 1000;

Review Comment:
   ```suggestion
           /**
            * The max size of buffer for the pre-aggregation
            */
           int countersNumber() default 1000;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1175924442


##########
oap-server-bom/pom.xml:
##########
@@ -72,7 +72,7 @@
         <awaitility.version>3.0.0</awaitility.version>
         <httpcore.version>4.4.13</httpcore.version>
         <commons-compress.version>1.21</commons-compress.version>
-        <banyandb-java-client.version>0.4.0-SNAPSHOT</banyandb-java-client.version>
+        <banyandb-java-client.version>0.4.0-rc0</banyandb-java-client.version>

Review Comment:
   Please make sure you sent the mail list update and other process first.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118899100


##########
oap-server/oal-rt/src/main/java/org/apache/skywalking/oal/rt/OALRuntime.java:
##########
@@ -315,6 +326,30 @@ private Class generateMetricsClass(AnalysisResult metricsStmt) throws OALCompile
         streamAnnotation.addMemberValue("processor", new ClassMemberValue(METRICS_STREAM_PROCESSOR, constPool));
 
         annotationsAttribute.addAnnotation(streamAnnotation);
+
+        // Add @BanyanDB.TopNAggregation if applicable
+        // 1. groupByTagName is not empty
+        // 2. parentClass implements LongValueHolder, IntValueHolder or DoubleValueHolder
+        try {
+            if (Arrays.stream(parentMetricsClass.getInterfaces())
+                    .anyMatch(i -> i.getSimpleName().equals("IntValueHolder") ||
+                            i.getSimpleName().equals("LongValueHolder") ||
+                            i.getSimpleName().equals("DoubleValueHolder"))) {
+                if (!groupByTagNames.isEmpty()) {
+                    Annotation topNAnnotation = new Annotation(BanyanDB.TopNAggregation.class.getName(), constPool);
+                    ArrayMemberValue amv = new ArrayMemberValue(constPool);
+                    MemberValue[] values = new MemberValue[groupByTagNames.size()];
+                    for (int i = 0; i < groupByTagNames.size(); ++i) {
+                        values[i] = new StringMemberValue(groupByTagNames.get(i), constPool);
+                    }
+                    amv.setValue(values);
+                    topNAnnotation.addMemberValue("groupByTagNames", amv);
+                    annotationsAttribute.addAnnotation(topNAnnotation);
+                }
+            }
+        } catch (NotFoundException ignored) {

Review Comment:
   Please don't ignore this important exception.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118953797


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }
+
+        if (resp.size() == 0) {
+            return Collections.emptyList();
+        } else if (resp.size() > 1) { // since we have done aggregation, i.e. MEAN

Review Comment:
   For a time range query, we will get a series of time windows. For example, when we are querying for the last 30mins, we may get 30 windows provided that the interval of the metric is 1min.
   
   And each time window (1 min) contains a fully ordered list, the maximum number of entries is controlled by the `countersNum` given in the TopNAggregation schema.
   
   So the (final) MEAN aggreation is imposed on all 30 windows, and calculate the average value of the same key (groupByTagValues). After all these operations, we get a single result set.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118105512


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/MetadataRegistry.java:
##########
@@ -173,24 +172,31 @@ public Measure registerMeasureModel(Model model, BanyanDBStorageConfig config, C
             schemaBuilder.field(field.getName());
         }
         // parse TopN
-        schemaBuilder.topNSpec(parseTopNSpec(model, schemaMetadata.name()));
+        schemaBuilder.topNSpec(parseTopNSpec(model, schemaMetadata.name(), tagsAndFields));
 
         registry.put(schemaMetadata.name(), schemaBuilder.build());
         return builder.build();
     }
 
-    private TopNSpec parseTopNSpec(final Model model, final String measureName) {
-        final String valueCName = ValueColumnMetadata.INSTANCE.getValueCName(model.getName());
-        if (StringUtil.isEmpty(valueCName)) {
+    private TopNSpec parseTopNSpec(final Model model, final String measureName, MeasureMetadata tagsAndFields) {
+        final Optional<ValueColumnMetadata.ValueColumn> valueColumnOpt = ValueColumnMetadata.INSTANCE.readValueColumnDefinition(model.getName());
+        if (valueColumnOpt.isEmpty()) {
             return null;
         }
-        // TODO: how to configure parameters?
+
+        List<String> groupByTagNames = new ArrayList<>();
+        groupByTagNames.add(Metrics.ENTITY_ID);
+        for (final TagMetadata tagSpec : tagsAndFields.tags) {
+            if (tagSpec.getTagSpec().getTagName().equals(InstanceTraffic.SERVICE_ID)) {
+                groupByTagNames.add(InstanceTraffic.SERVICE_ID);

Review Comment:
   How about adding an `Array` in the `TopNAggregation` annotation to define the columns? Then we could verify whether these columns are valid. Only should boot successfully after validation passed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118095679


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -51,6 +52,43 @@ public BanyanDBAggregationQueryDAO(BanyanDBStorageClient client) {
     public List<SelectedRecord> sortMetrics(TopNCondition condition, String valueColumnName, Duration duration, List<KeyValue> additionalConditions) throws IOException {
         final String modelName = condition.getName();
         final TimestampRange timestampRange = new TimestampRange(duration.getStartTimestamp(), duration.getEndTimestamp());
+        // fast-path: BanyanDB server-side TopN support
+        MetadataRegistry.Schema schema = MetadataRegistry.INSTANCE.findMetadata(modelName, duration.getStep());
+        if (schema == null) {
+            throw new IOException("schema is not registered");
+        }
+
+        MetadataRegistry.ColumnSpec spec = schema.getSpec(valueColumnName);
+        if (spec == null) {
+            throw new IOException("field spec is not registered");
+        }
+
+        if (schema.hasTopNAggregation()) {

Review Comment:
   You are right. I was confused. The legacy code is removed 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118923225


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   I think we should always add complete groupByTags to the annotation in order to make filter work.
   
   BanyanDB client can restore those groupBy tags, thus we only need one TopN processor in the database.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118919216


##########
oap-server/oal-rt/src/main/java/org/apache/skywalking/oal/rt/OALRuntime.java:
##########
@@ -315,6 +326,30 @@ private Class generateMetricsClass(AnalysisResult metricsStmt) throws OALCompile
         streamAnnotation.addMemberValue("processor", new ClassMemberValue(METRICS_STREAM_PROCESSOR, constPool));
 
         annotationsAttribute.addAnnotation(streamAnnotation);
+
+        // Add @BanyanDB.TopNAggregation if applicable
+        // 1. groupByTagName is not empty
+        // 2. parentClass implements LongValueHolder, IntValueHolder or DoubleValueHolder
+        try {
+            if (Arrays.stream(parentMetricsClass.getInterfaces())
+                    .anyMatch(i -> i.getSimpleName().equals("IntValueHolder") ||
+                            i.getSimpleName().equals("LongValueHolder") ||
+                            i.getSimpleName().equals("DoubleValueHolder"))) {

Review Comment:
   Oops, forgot this. Then, use the full class name, such as `LongValueHolder.class.getCanonicalName()`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118944320


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   > I mean there are two kinds of topN.
   > 
   > 1. Where **service_id = xxx** and time_range in [start, end] Group by entityID
   > 2. Where and time_range in [start, end] Group by entityID
   > 
   > Check the query condition protocol. https://github.com/apache/skywalking-query-protocol/blob/450328c6445d4dc50a868a0b797a14755f1d82cc/metrics-v2.graphqls#L74-L75
   
   BanyanDB supports post-aggregation as is hinted in the protocol
   
   https://github.com/apache/skywalking-banyandb/blob/main/api/proto/banyandb/measure/v1/topn.proto#L62
   
   I suppose it can serve the first case



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng merged pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng merged PR #10448:
URL: https://github.com/apache/skywalking/pull/10448


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#issuecomment-1445380385

   > Now we have to decide how we can define the parameters? May be in the annotation?
   
   I think BanyanDB's specific annotation works. 
   
   > And another issue is to support additional filter in the aggregation query. The only property is serviceID now?
   
   For now, I think only the time range, max candidates for precalculation and service ID.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118906645


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }
+
+        if (resp.size() == 0) {
+            return Collections.emptyList();
+        } else if (resp.size() > 1) { // since we have done aggregation, i.e. MEAN

Review Comment:
   What do you mean `MEAN` here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118907876


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   `additionalConditions` may contain `service_id` or not. Do we need two TopN annotations for two rules?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118757785


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/MetadataRegistry.java:
##########
@@ -173,24 +172,31 @@ public Measure registerMeasureModel(Model model, BanyanDBStorageConfig config, C
             schemaBuilder.field(field.getName());
         }
         // parse TopN
-        schemaBuilder.topNSpec(parseTopNSpec(model, schemaMetadata.name()));
+        schemaBuilder.topNSpec(parseTopNSpec(model, schemaMetadata.name(), tagsAndFields));
 
         registry.put(schemaMetadata.name(), schemaBuilder.build());
         return builder.build();
     }
 
-    private TopNSpec parseTopNSpec(final Model model, final String measureName) {
-        final String valueCName = ValueColumnMetadata.INSTANCE.getValueCName(model.getName());
-        if (StringUtil.isEmpty(valueCName)) {
+    private TopNSpec parseTopNSpec(final Model model, final String measureName, MeasureMetadata tagsAndFields) {
+        final Optional<ValueColumnMetadata.ValueColumn> valueColumnOpt = ValueColumnMetadata.INSTANCE.readValueColumnDefinition(model.getName());
+        if (valueColumnOpt.isEmpty()) {
             return null;
         }
-        // TODO: how to configure parameters?
+
+        List<String> groupByTagNames = new ArrayList<>();
+        groupByTagNames.add(Metrics.ENTITY_ID);
+        for (final TagMetadata tagSpec : tagsAndFields.tags) {
+            if (tagSpec.getTagSpec().getTagName().equals(InstanceTraffic.SERVICE_ID)) {
+                groupByTagNames.add(InstanceTraffic.SERVICE_ID);

Review Comment:
   All single value metrics, I think. We don't support labeled value metrics for topN query.
   
   That is why the manual setup is better than this hidden logic.
   
   BTW, the top-N query support includes two cases, w/ or w/o parent service ID. Usually instance Top-N should have service ID but it is not required from protocol level. Also, service metric has no parent for topN



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1167565728


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +59,41 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");

Review Comment:
   @hanahmily You need to test this very carefully. For now, the error could only be thrown when the TopN query is applied. We would not be able to check whether TopN query is set or not.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1174603337


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -51,7 +52,53 @@ public BanyanDBAggregationQueryDAO(BanyanDBStorageClient client) {
     public List<SelectedRecord> sortMetrics(TopNCondition condition, String valueColumnName, Duration duration, List<KeyValue> additionalConditions) throws IOException {
         final String modelName = condition.getName();
         final TimestampRange timestampRange = new TimestampRange(duration.getStartTimestamp(), duration.getEndTimestamp());
-        MeasureQueryResponse resp = query(modelName, TAGS, Collections.singleton(valueColumnName),
+        MetadataRegistry.Schema schema = MetadataRegistry.INSTANCE.findMetadata(modelName, duration.getStep());
+        if (schema == null) {
+            throw new IOException("schema is not registered");
+        }
+
+        MetadataRegistry.ColumnSpec spec = schema.getSpec(valueColumnName);
+        if (spec == null) {
+            throw new IOException("field spec is not registered");
+        }
+
+        // BanyanDB server-side TopN support: for metrics aggregated at instance, endpoint levels
+        if (schema.getTopNSpec() != null) {
+            return serverSideTopN(condition, schema, spec, timestampRange, additionalConditions);

Review Comment:
   Fixed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1175921278


##########
oap-server-bom/pom.xml:
##########
@@ -72,7 +72,7 @@
         <awaitility.version>3.0.0</awaitility.version>
         <httpcore.version>4.4.13</httpcore.version>
         <commons-compress.version>1.21</commons-compress.version>
-        <banyandb-java-client.version>0.4.0-SNAPSHOT</banyandb-java-client.version>
+        <banyandb-java-client.version>0.4.0-rc0</banyandb-java-client.version>

Review Comment:
   Is the rc0 already out on maven central?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118953797


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }
+
+        if (resp.size() == 0) {
+            return Collections.emptyList();
+        } else if (resp.size() > 1) { // since we have done aggregation, i.e. MEAN

Review Comment:
   For a time range query, we will get a series of time windows. For example, when we are querying for the last 30mins, we may get 30 windows provided that the interval of the metric is 1min.
   
   And each time window (1 min) contains a fully ordered list, the maximum number of entries is controlled by the `countersNum` given in the TopNAggregation schema.
   
   So the (final) MEAN aggreation is imposed on all 30 windows, and gives the average value of the same key (groupByTagValues). After all these operations, we get a single result set.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1119459445


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   @lujiajing1126 I did a discussion with @hanahmily 
   I think the server side is clear, and the thing is you seem mixed query groupBy condition and topN pre-aggregation GroupBy condition.
   
   In the current case, we should have two TopN definition per metric
   1. Global TopN 
   2. Service-ID oriented groupBy TopN
   
   Notice, the entityID should never be involved pre-aggregation TopN. That ID is the groupBy condition in the query stage only about how to merge data cross bucket. But pre-aggregation is bucket-based.
   
   I mentioned to @hanahmily, in theory <1> could be built through all <2>'s TopN by costing some extra CPU. Considering <1> oriented query is supported but not widely used, we could enhance the server side to build <1> in the query stage only. You could discuss with Gao to see whether we could do this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118967182


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   I think server side should treat topN more complex. Such as, we have conditions(service id) as filter, put pre aggregated list according to service candidate?
   
   Anyway, we need a solution for this. Otherwise, this is a bug. In a distributed system, some services are slower than others.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118915600


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   If we need two annotations at the class level, we need another annotation(syntactic sugar of TopN)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1119438687


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   How to define that? The service ID is not predictable. We just know service ID should be used as `group by` condition.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1174577087


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -51,7 +52,53 @@ public BanyanDBAggregationQueryDAO(BanyanDBStorageClient client) {
     public List<SelectedRecord> sortMetrics(TopNCondition condition, String valueColumnName, Duration duration, List<KeyValue> additionalConditions) throws IOException {
         final String modelName = condition.getName();
         final TimestampRange timestampRange = new TimestampRange(duration.getStartTimestamp(), duration.getEndTimestamp());
-        MeasureQueryResponse resp = query(modelName, TAGS, Collections.singleton(valueColumnName),
+        MetadataRegistry.Schema schema = MetadataRegistry.INSTANCE.findMetadata(modelName, duration.getStep());
+        if (schema == null) {
+            throw new IOException("schema is not registered");
+        }
+
+        MetadataRegistry.ColumnSpec spec = schema.getSpec(valueColumnName);
+        if (spec == null) {
+            throw new IOException("field spec is not registered");
+        }
+
+        // BanyanDB server-side TopN support: for metrics aggregated at instance, endpoint levels
+        if (schema.getTopNSpec() != null) {
+            return serverSideTopN(condition, schema, spec, timestampRange, additionalConditions);

Review Comment:
   I think we should check the keys(additionalConditions#key) is declared as  `groupByTagNames`.
   
   Otherwise, we may face unexpected query result, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1117940592


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -51,6 +52,43 @@ public BanyanDBAggregationQueryDAO(BanyanDBStorageClient client) {
     public List<SelectedRecord> sortMetrics(TopNCondition condition, String valueColumnName, Duration duration, List<KeyValue> additionalConditions) throws IOException {
         final String modelName = condition.getName();
         final TimestampRange timestampRange = new TimestampRange(duration.getStartTimestamp(), duration.getEndTimestamp());
+        // fast-path: BanyanDB server-side TopN support
+        MetadataRegistry.Schema schema = MetadataRegistry.INSTANCE.findMetadata(modelName, duration.getStep());
+        if (schema == null) {
+            throw new IOException("schema is not registered");
+        }
+
+        MetadataRegistry.ColumnSpec spec = schema.getSpec(valueColumnName);
+        if (spec == null) {
+            throw new IOException("field spec is not registered");
+        }
+
+        if (schema.hasTopNAggregation()) {

Review Comment:
   Such as? I think metric type has locked those metrics in single value type only.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1119472695


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   So about the annotation, we could keep  `@BanyanDB.TopNAggregation(groupByTagNames = {"SERVICE_ID"})` for all OAL and MAL function, and let `groupByTagNames` accepts the column not existing. Then we even dont' need to update OALRuntime, but keep the annotation on the function(OAL and MAL) only.
   WDYT? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] hanahmily commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "hanahmily (via GitHub)" <gi...@apache.org>.
hanahmily commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118174845


##########
oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/model/BanyanDBModelExtension.java:
##########
@@ -45,4 +46,23 @@ public class BanyanDBModelExtension {
     @Setter
     private boolean storeIDTag;
 
+    /**
+     * lru_size defines how many snapshots are allowed to be maintained in the memory.
+     * The default value is 2 in the BanyanDB if not set.
+     *
+     * @since 9.4.0
+     */
+    @Getter
+    @Setter
+    private int lruSize = 2;

Review Comment:
   https://github.com/apache/skywalking-banyandb/blob/ad6a81c726640c8ffb8805adac4f43ffa5b2c423/api/proto/banyandb/database/v1/schema.proto#L138.
   
   It determines how many time_buckets are held. `2` means a data point belonging to the latest 2 time_buckets will be analyzed. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1117939709


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -51,6 +52,43 @@ public BanyanDBAggregationQueryDAO(BanyanDBStorageClient client) {
     public List<SelectedRecord> sortMetrics(TopNCondition condition, String valueColumnName, Duration duration, List<KeyValue> additionalConditions) throws IOException {
         final String modelName = condition.getName();
         final TimestampRange timestampRange = new TimestampRange(duration.getStartTimestamp(), duration.getEndTimestamp());
+        // fast-path: BanyanDB server-side TopN support
+        MetadataRegistry.Schema schema = MetadataRegistry.INSTANCE.findMetadata(modelName, duration.getStep());
+        if (schema == null) {
+            throw new IOException("schema is not registered");
+        }
+
+        MetadataRegistry.ColumnSpec spec = schema.getSpec(valueColumnName);
+        if (spec == null) {
+            throw new IOException("field spec is not registered");
+        }
+
+        if (schema.hasTopNAggregation()) {

Review Comment:
   > In which case, there is no server-side pre-calculated aggregation?
   
   For those metrics that do not have field.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118923225


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }

Review Comment:
   I think we should always add complete groupByTags to the annotation in order to make filter works.
   
   BanyanDB client can restore those groupBy tags, thus we only need one TopN processor in the database.



##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }
+
+        if (resp.size() == 0) {
+            return Collections.emptyList();
+        } else if (resp.size() > 1) { // since we have done aggregation, i.e. MEAN

Review Comment:
   Aggregation by avg



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118933586


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +60,45 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }
+
+        if (resp.size() == 0) {
+            return Collections.emptyList();
+        } else if (resp.size() > 1) { // since we have done aggregation, i.e. MEAN

Review Comment:
   We are asking for a TopN(with limit). Why it should be `size==1` only? I am confused.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118880408


##########
oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/model/BanyanDBModelExtension.java:
##########
@@ -45,4 +46,23 @@ public class BanyanDBModelExtension {
     @Setter
     private boolean storeIDTag;
 
+    /**
+     * lru_size defines how many snapshots are allowed to be maintained in the memory.
+     * The default value is 2 in the BanyanDB if not set.
+     *
+     * @since 9.4.0
+     */
+    @Getter
+    @Setter
+    private int lruSize = 2;

Review Comment:
   Thanks for the explanation. I've reworded a bit for this doc



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118897935


##########
oap-server/oal-rt/src/main/java/org/apache/skywalking/oal/rt/OALRuntime.java:
##########
@@ -315,6 +326,30 @@ private Class generateMetricsClass(AnalysisResult metricsStmt) throws OALCompile
         streamAnnotation.addMemberValue("processor", new ClassMemberValue(METRICS_STREAM_PROCESSOR, constPool));
 
         annotationsAttribute.addAnnotation(streamAnnotation);
+
+        // Add @BanyanDB.TopNAggregation if applicable
+        // 1. groupByTagName is not empty
+        // 2. parentClass implements LongValueHolder, IntValueHolder or DoubleValueHolder
+        try {
+            if (Arrays.stream(parentMetricsClass.getInterfaces())
+                    .anyMatch(i -> i.getSimpleName().equals("IntValueHolder") ||
+                            i.getSimpleName().equals("LongValueHolder") ||
+                            i.getSimpleName().equals("DoubleValueHolder"))) {

Review Comment:
   I think `isAssignableFrom` is a better way.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1118900586


##########
oap-server/oal-rt/src/main/java/org/apache/skywalking/oal/rt/OALRuntime.java:
##########
@@ -315,6 +326,30 @@ private Class generateMetricsClass(AnalysisResult metricsStmt) throws OALCompile
         streamAnnotation.addMemberValue("processor", new ClassMemberValue(METRICS_STREAM_PROCESSOR, constPool));
 
         annotationsAttribute.addAnnotation(streamAnnotation);
+
+        // Add @BanyanDB.TopNAggregation if applicable
+        // 1. groupByTagName is not empty
+        // 2. parentClass implements LongValueHolder, IntValueHolder or DoubleValueHolder
+        try {
+            if (Arrays.stream(parentMetricsClass.getInterfaces())
+                    .anyMatch(i -> i.getSimpleName().equals("IntValueHolder") ||
+                            i.getSimpleName().equals("LongValueHolder") ||
+                            i.getSimpleName().equals("DoubleValueHolder"))) {

Review Comment:
   For `CtClass` in Javassist, it that possible to use `isAssignableFrom`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] lujiajing1126 commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "lujiajing1126 (via GitHub)" <gi...@apache.org>.
lujiajing1126 commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1167728084


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -85,24 +59,41 @@ protected void apply(MeasureQuery query) {
             throw new IOException("field spec is not registered");
         }
 
+        if (schema.getTopNSpec() == null) {
+            throw new IOException("TopN spec is registered");
+        }
+
+        TopNQueryResponse resp = null;
+        if (condition.getOrder() == Order.DES) {
+            resp = topN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        } else {
+            resp = bottomN(schema, timestampRange, condition.getTopN(), additionalConditions);
+        }
+
+        if (resp.size() == 0) {
+            return Collections.emptyList();
+        } else if (resp.size() > 1) { // since we have done aggregation, i.e. MEAN
+            throw new IOException("invalid TopN response");
+        }
+
         final List<SelectedRecord> topNList = new ArrayList<>();
-        for (DataPoint dataPoint : resp.getDataPoints()) {
+        for (TopNQueryResponse.Item item : resp.getTopNLists().get(0).getItems()) {

Review Comment:
   Answered here https://github.com/apache/skywalking/pull/10448#discussion_r1118953797



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1174576647


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -51,7 +52,53 @@ public BanyanDBAggregationQueryDAO(BanyanDBStorageClient client) {
     public List<SelectedRecord> sortMetrics(TopNCondition condition, String valueColumnName, Duration duration, List<KeyValue> additionalConditions) throws IOException {
         final String modelName = condition.getName();
         final TimestampRange timestampRange = new TimestampRange(duration.getStartTimestamp(), duration.getEndTimestamp());
-        MeasureQueryResponse resp = query(modelName, TAGS, Collections.singleton(valueColumnName),
+        MetadataRegistry.Schema schema = MetadataRegistry.INSTANCE.findMetadata(modelName, duration.getStep());
+        if (schema == null) {
+            throw new IOException("schema is not registered");
+        }
+
+        MetadataRegistry.ColumnSpec spec = schema.getSpec(valueColumnName);
+        if (spec == null) {
+            throw new IOException("field spec is not registered");
+        }
+
+        // BanyanDB server-side TopN support: for metrics aggregated at instance, endpoint levels

Review Comment:
   ```suggestion
           // BanyanDB server-side TopN support for metrics pre-aggregation.
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10448: Integrate BanyanDB server-side TopN

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10448:
URL: https://github.com/apache/skywalking/pull/10448#discussion_r1174687593


##########
oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBAggregationQueryDAO.java:
##########
@@ -64,6 +63,13 @@ public List<SelectedRecord> sortMetrics(TopNCondition condition, String valueCol
 
         // BanyanDB server-side TopN support for metrics pre-aggregation.
         if (schema.getTopNSpec() != null) {
+            // check if additional conditions are registered
+            if (CollectionUtils.isNotEmpty(additionalConditions)) {
+                final Set<String> additionalConditionNames = additionalConditions.stream().map(KeyValue::getKey).collect(Collectors.toSet());
+                if (!ImmutableSet.copyOf(schema.getTopNSpec().getGroupByTagNames()).containsAll(additionalConditionNames)) {
+                    throw new IOException("[" + Strings.join(additionalConditionNames, ',') + "] conditions are not registered as groupBy tags");

Review Comment:
   Two questions,
   1. I think group-by pre-aggregation should be `equal` or `no condition`, rather than `contains`? I know that we have one condition, so, contains means extractly the same. But I am trying to be precise. 
   2. When this pre-aggregation doesn't fit the current query, I think failing back to on-demand group-by still works. We don't have to through exception, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org