Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/07/21 11:35:42 UTC

[GitHub] [doris] mrhhsg opened a new pull request, #11084: [improvement][agg]Process aggregated results in the vectorized way

mrhhsg opened a new pull request, #11084:
URL: https://github.com/apache/doris/pull/11084

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Aggregated results are currently processed row by row. This accesses memory inefficiently and incurs far more virtual function calls than necessary.
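   
   To make the contrast concrete, here is a minimal sketch of row-by-row versus batched result emission. It is not the code from this PR: `AggFunction`, `CountFunction`, `insert_result`, `insert_results`, `emit_row_by_row` and `emit_batched` are hypothetical names chosen for illustration, and the aggregate state is assumed to be a plain `int64_t` counter.
   
   ```cpp
   #include <cstddef>
   #include <cstdint>
   #include <vector>
   
   // Hypothetical interface for illustration only; not Doris's actual API.
   struct AggFunction {
       virtual ~AggFunction() = default;
       // Row-wise: the caller makes one virtual call per aggregated state.
       virtual void insert_result(const int64_t* state, std::vector<int64_t>& out) const = 0;
       // Batched: the caller makes one virtual call for a whole run of states.
       virtual void insert_results(const int64_t* states, size_t n,
                                   std::vector<int64_t>& out) const = 0;
   };
   
   // Assumed COUNT-like function whose state is a single int64_t.
   struct CountFunction final : AggFunction {
       void insert_result(const int64_t* state, std::vector<int64_t>& out) const override {
           out.push_back(*state);
       }
       void insert_results(const int64_t* states, size_t n,
                           std::vector<int64_t>& out) const override {
           // One bulk copy over contiguous memory instead of n separate appends.
           out.insert(out.end(), states, states + n);
       }
   };
   
   // Row-by-row path: states.size() virtual calls.
   std::vector<int64_t> emit_row_by_row(const AggFunction& fn, const std::vector<int64_t>& states) {
       std::vector<int64_t> out;
       for (const int64_t& s : states) {
           fn.insert_result(&s, out);
       }
       return out;
   }
   
   // Batched path: a single virtual call for the whole block.
   std::vector<int64_t> emit_batched(const AggFunction& fn, const std::vector<int64_t>& states) {
       std::vector<int64_t> out;
       out.reserve(states.size());
       fn.insert_results(states.data(), states.size(), out);
       return out;
   }
   ```
   
   Both paths produce the same output column. The batched one pays the virtual dispatch once per block rather than once per row, and its inner loop runs over contiguous memory.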
   
   Tested on ClickBench (https://github.com/ClickHouse/ClickBench) with the following SQL:
   ```sql
   SELECT UserID, COUNT(*) FROM hits GROUP BY UserID ORDER BY COUNT(*) DESC LIMIT 10;
   ```
   master:
   <img width="1301" alt="image" src="https://user-images.githubusercontent.com/1179834/180204059-997f16f1-9b51-4351-9b33-bbd8409b6b7a.png">
   
   agg_result_vec:
   <img width="1387" alt="image" src="https://user-images.githubusercontent.com/1179834/180204114-70655f00-5b38-4d6d-9069-fefaac541026.png">
   
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] adonis0147 commented on pull request #11084: [improvement][agg]Process aggregated results in the vectorized way

Posted by GitBox <gi...@apache.org>.
adonis0147 commented on PR #11084:
URL: https://github.com/apache/doris/pull/11084#issuecomment-1192262477

   `Clang build` failed. Please fix it.




[GitHub] [doris] yiguolei merged pull request #11084: [improvement][agg]Process aggregated results in the vectorized way

Posted by GitBox <gi...@apache.org>.
yiguolei merged PR #11084:
URL: https://github.com/apache/doris/pull/11084




[GitHub] [doris] github-actions[bot] commented on pull request #11084: [improvement][agg]Process aggregated results in the vectorized way

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #11084:
URL: https://github.com/apache/doris/pull/11084#issuecomment-1191434801

   PR approved by at least one committer and no changes requested.




[GitHub] [doris] github-actions[bot] commented on pull request #11084: [improvement][agg]Process aggregated results in the vectorized way

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #11084:
URL: https://github.com/apache/doris/pull/11084#issuecomment-1191434864

   PR approved by anyone and no changes requested.




[GitHub] [doris] HappenLee commented on a diff in pull request #11084: [improvement][agg]Process aggregated results in the vectorized way

Posted by GitBox <gi...@apache.org>.
HappenLee commented on code in PR #11084:
URL: https://github.com/apache/doris/pull/11084#discussion_r927262536


##########
be/src/vec/columns/column_nullable.cpp:
##########
@@ -152,6 +166,17 @@ void ColumnNullable::serialize_vec(std::vector<StringRef>& keys, size_t num_rows
     get_nested_column().serialize_vec_with_null_map(keys, num_rows, arr.data(), max_row_byte_size);
 }
 
+void ColumnNullable::deserialize_vec(std::vector<StringRef>& keys, const size_t num_rows) {
+    const auto& arr = get_null_map_data();
+    for (size_t i = 0; i != num_rows; ++i) {

Review Comment:
   This loop looks like it could be vectorized with SIMD.
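   
   The diff excerpt ends before the loop body, so what the loop does per row is not shown here. As a general illustration of the point, a pass over a null map tends to auto-vectorize when its body has no data-dependent branch. The sketch below assumes the null map is a contiguous array of `uint8_t` in which 1 means NULL and 0 means not NULL. `count_nulls` and `has_null` are hypothetical helpers, not functions from this PR.
   
   ```cpp
   #include <cstddef>
   #include <cstdint>
   
   // Hypothetical helper: counts NULL rows, assuming 1 = NULL and 0 = not NULL.
   // The body is a plain add with no branch on the data, which optimizing
   // compilers can often turn into SIMD instructions.
   size_t count_nulls(const uint8_t* null_map, size_t num_rows) {
       size_t count = 0;
       for (size_t i = 0; i != num_rows; ++i) {
           count += null_map[i];
       }
       return count;
   }
   
   // Hypothetical helper: reports whether any row is NULL.
   // OR-reducing every byte (no early exit) keeps the loop branch-free.
   bool has_null(const uint8_t* null_map, size_t num_rows) {
       uint8_t acc = 0;
       for (size_t i = 0; i != num_rows; ++i) {
           acc |= null_map[i];
       }
       return acc != 0;
   }
   ```
   
   Whether a given loop is actually vectorized depends on the compiler and flags, so it is worth checking the generated assembly or the compiler's vectorization report.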


