You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by "englefly (via GitHub)" <gi...@apache.org> on 2023/06/19 08:13:51 UTC

[GitHub] [doris] englefly opened a new pull request, #20978: [fix](nereids) compute size bug

englefly opened a new pull request, #20978:
URL: https://github.com/apache/doris/pull/20978

   ## Proposed changes
   expressions in stats.expressionToColumnStats is not the output of plan node.
   to compute output byte size, we need to get every output slot from stats.expressionToColumnStats, and sum all columnStatistic.averageSize.
   
   
   
   Issue Number: close #xxx
   
   <!--Describe your changes.-->
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] englefly commented on pull request #20978: [fix](nereids) compute output size in correct way

Posted by "englefly (via GitHub)" <gi...@apache.org>.
englefly commented on PR #20978:
URL: https://github.com/apache/doris/pull/20978#issuecomment-1597049166

   run buildall


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] morrySnow commented on a diff in pull request #20978: [fix](nereids) compute output size in correct way

Posted by "morrySnow (via GitHub)" <gi...@apache.org>.
morrySnow commented on code in PR #20978:
URL: https://github.com/apache/doris/pull/20978#discussion_r1233713420


##########
fe/fe-core/src/main/java/org/apache/doris/statistics/Statistics.java:
##########
@@ -149,21 +155,28 @@ public Statistics merge(Statistics statistics) {
         return this;
     }
 
-    private double computeTupleSize() {
+    private double computeTupleSize(Plan plan) {
         if (tupleSize <= 0) {
-            tupleSize = expressionToColumnStats.values().stream()
-                    .map(s -> s.avgSizeByte).reduce(0D, Double::sum);
+            tupleSize = 0;
+            for (Slot slot : plan.getOutput()) {
+                ColumnStatistic colStats = expressionToColumnStats.get(slot);
+                if (colStats != null) {
+                    tupleSize += colStats.avgSizeByte;
+                } else {
+                    tupleSize += 8;

Review Comment:
   what's mean about 8?



##########
fe/fe-core/src/main/java/org/apache/doris/statistics/Statistics.java:
##########
@@ -149,21 +155,28 @@ public Statistics merge(Statistics statistics) {
         return this;
     }
 
-    private double computeTupleSize() {
+    private double computeTupleSize(Plan plan) {
         if (tupleSize <= 0) {
-            tupleSize = expressionToColumnStats.values().stream()
-                    .map(s -> s.avgSizeByte).reduce(0D, Double::sum);
+            tupleSize = 0;
+            for (Slot slot : plan.getOutput()) {
+                ColumnStatistic colStats = expressionToColumnStats.get(slot);

Review Comment:
   why we need do this? expressionToColumnStats do not contain and only contain all slot of plan's output?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] englefly commented on a diff in pull request #20978: [fix](nereids) compute output size in correct way

Posted by "englefly (via GitHub)" <gi...@apache.org>.
englefly commented on code in PR #20978:
URL: https://github.com/apache/doris/pull/20978#discussion_r1233951854


##########
fe/fe-core/src/main/java/org/apache/doris/statistics/Statistics.java:
##########
@@ -149,21 +155,28 @@ public Statistics merge(Statistics statistics) {
         return this;
     }
 
-    private double computeTupleSize() {
+    private double computeTupleSize(Plan plan) {
         if (tupleSize <= 0) {
-            tupleSize = expressionToColumnStats.values().stream()
-                    .map(s -> s.avgSizeByte).reduce(0D, Double::sum);
+            tupleSize = 0;
+            for (Slot slot : plan.getOutput()) {
+                ColumnStatistic colStats = expressionToColumnStats.get(slot);

Review Comment:
   yes,expressionToColumnStats.keySet() and plan.getOutput() are not match



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on pull request #20978: [fix](nereids) compute size bug

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #20978:
URL: https://github.com/apache/doris/pull/20978#issuecomment-1596726779

   PR approved by anyone and no changes requested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] englefly commented on a diff in pull request #20978: [fix](nereids) compute output size in correct way

Posted by "englefly (via GitHub)" <gi...@apache.org>.
englefly commented on code in PR #20978:
URL: https://github.com/apache/doris/pull/20978#discussion_r1233959696


##########
fe/fe-core/src/main/java/org/apache/doris/statistics/Statistics.java:
##########
@@ -149,21 +155,28 @@ public Statistics merge(Statistics statistics) {
         return this;
     }
 
-    private double computeTupleSize() {
+    private double computeTupleSize(Plan plan) {
         if (tupleSize <= 0) {
-            tupleSize = expressionToColumnStats.values().stream()
-                    .map(s -> s.avgSizeByte).reduce(0D, Double::sum);
+            tupleSize = 0;
+            for (Slot slot : plan.getOutput()) {
+                ColumnStatistic colStats = expressionToColumnStats.get(slot);
+                if (colStats != null) {
+                    tupleSize += colStats.avgSizeByte;
+                } else {
+                    tupleSize += 8;

Review Comment:
   some output columns are not in that map.
   These columns are global Agg function column. for example, there is "partial_sum" column in map, but we want "sum".
     
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] englefly closed pull request #20978: [fix](nereids) compute output size in correct way

Posted by "englefly (via GitHub)" <gi...@apache.org>.
englefly closed pull request #20978: [fix](nereids) compute output size in correct way
URL: https://github.com/apache/doris/pull/20978


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on pull request #20978: [fix](nereids) compute output size in correct way

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #20978:
URL: https://github.com/apache/doris/pull/20978#issuecomment-1597396250

   PR approved by at least one committer and no changes requested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org