Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/10/16 05:15:43 UTC

[GitHub] [doris] mrhhsg opened a new pull request, #13397: [improvement](scan) speed up inserting strings into ColumnString

mrhhsg opened a new pull request, #13397:
URL: https://github.com/apache/doris/pull/13397

   # Proposed changes
   Test SQL:
   ```sql
   select count(distinct url) from hits where UserID % 5 = 0;
   ```
   Without this optimization:
   <img width="1655" alt="image" src="https://user-images.githubusercontent.com/1179834/196019282-32dd905d-5cfd-40e7-8de0-cb402652c903.png">
   
   With this optimization:
   <img width="834" alt="image" src="https://user-images.githubusercontent.com/1179834/196019272-5d169f5a-4a82-4efd-baf8-98d2972fc51b.png">
   
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
       - [ ] Yes
       - [ ] No
       - [ ] I don't know
   2. Has unit tests been added:
       - [ ] Yes
       - [ ] No
       - [ ] No Need
   3. Has document been added or modified:
       - [ ] Yes
       - [ ] No
       - [ ] No Need
   4. Does it need to update dependencies:
       - [ ] Yes
       - [ ] No
   5. Are there any changes that cannot be rolled back:
       - [ ] Yes (If Yes, please explain WHY)
       - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on pull request #13397: [improvement](scan) speed up inserting strings into ColumnString

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13397:
URL: https://github.com/apache/doris/pull/13397#issuecomment-1296141864

   PR approved by anyone and no changes requested.




[GitHub] [doris] HappenLee commented on a diff in pull request #13397: [improvement](scan) speed up inserting strings into ColumnString

Posted by GitBox <gi...@apache.org>.
HappenLee commented on code in PR #13397:
URL: https://github.com/apache/doris/pull/13397#discussion_r996661974


##########
be/src/vec/columns/predicate_column.h:
##########
@@ -91,13 +91,17 @@ class PredicateColumnType final : public COWHelper<IColumn, PredicateColumnType<
     void insert_string_to_res_column(const uint16_t* sel, size_t sel_size,
                                      vectorized::ColumnString* res_ptr) {
         StringRef refs[sel_size];
+        size_t length = 0;
         for (size_t i = 0; i < sel_size; i++) {
             uint16_t n = sel[i];
             auto& sv = reinterpret_cast<StringValue&>(data[n]);
             refs[i].data = sv.ptr;
             refs[i].size = sv.len;
+            length += sv.len;
         }
-        res_ptr->insert_many_continuous_strings(refs, sel_size);

Review Comment:
   This is the only call site of `insert_many_continuous_strings`. Once this call is removed, we should delete the function as well.
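
   The hunk above only shows the new `length` accumulator; the call that replaces `insert_many_continuous_strings` is not quoted here. As a rough sketch of why summing the lengths during the gather loop can help, the example below grows the destination byte buffer once and then copies each string into place. `StrRef`, `SimpleStringColumn`, `insert_many` and `gather_selected` are hypothetical stand-ins written for illustration, not the Doris `ColumnString` API, and `std::vector` is used instead of the variable-length array in the original.

   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <cstring>
   #include <string>
   #include <vector>

   // Hypothetical stand-in for a (pointer, length) string reference.
   struct StrRef {
       const char* data;
       size_t size;
   };

   // Hypothetical simplified string column: all bytes live in one buffer,
   // and offsets[i] is the end position of row i in that buffer.
   struct SimpleStringColumn {
       std::vector<char> chars;
       std::vector<uint32_t> offsets;

       // Append `num` strings whose combined byte length is already known.
       void insert_many(const StrRef* refs, size_t num, size_t total_len) {
           const size_t old_size = chars.size();
           chars.resize(old_size + total_len);     // grow the byte buffer once
           offsets.reserve(offsets.size() + num);  // grow the offsets once

           size_t pos = old_size;
           for (size_t i = 0; i < num; ++i) {
               if (refs[i].size != 0) {
                   std::memcpy(chars.data() + pos, refs[i].data, refs[i].size);
               }
               pos += refs[i].size;
               offsets.push_back(static_cast<uint32_t>(pos));
           }
           assert(pos == old_size + total_len);  // the caller's total must match
       }

       // Return a copy of the string stored in `row`.
       std::string get(size_t row) const {
           const size_t begin = row == 0 ? 0 : offsets[row - 1];
           return std::string(chars.data() + begin, offsets[row] - begin);
       }
   };

   // Gather the selected rows and sum their lengths in the same pass,
   // mirroring the `length += sv.len` line added in the diff.
   inline void gather_selected(const std::vector<std::string>& src,
                               const uint16_t* sel, size_t sel_size,
                               SimpleStringColumn& dst) {
       std::vector<StrRef> refs(sel_size);
       size_t length = 0;
       for (size_t i = 0; i < sel_size; ++i) {
           const std::string& s = src[sel[i]];
           refs[i] = StrRef{s.data(), s.size()};
           length += s.size();
       }
       dst.insert_many(refs.data(), sel_size, length);
   }
   ```

   With the total known up front, the destination never has to reallocate in the middle of the copy loop. Whether the merged change works exactly this way is an assumption.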





[GitHub] [doris] hello-stephen commented on pull request #13397: [improvement](scan) speed up inserting strings into ColumnString

Posted by GitBox <gi...@apache.org>.
hello-stephen commented on PR #13397:
URL: https://github.com/apache/doris/pull/13397#issuecomment-1293576658

   TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 37.78 seconds
    load time: 569 seconds
    storage size: 17154699360 Bytes
    https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20221027220431_clickbench_pr_34852.html




[GitHub] [doris] HappenLee merged pull request #13397: [improvement](scan) speed up inserting strings into ColumnString

Posted by GitBox <gi...@apache.org>.
HappenLee merged PR #13397:
URL: https://github.com/apache/doris/pull/13397




[GitHub] [doris] github-actions[bot] commented on pull request #13397: [improvement](scan) speed up inserting strings into ColumnString

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13397:
URL: https://github.com/apache/doris/pull/13397#issuecomment-1296141851

   PR approved by at least one committer and no changes requested.




[GitHub] [doris] hello-stephen commented on pull request #13397: [improvement](scan) speed up inserting strings into ColumnString

Posted by GitBox <gi...@apache.org>.
hello-stephen commented on PR #13397:
URL: https://github.com/apache/doris/pull/13397#issuecomment-1294290570

   TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 38.15 seconds
    load time: 575 seconds
    storage size: 17154820913 Bytes
    https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20221028091430_clickbench_pr_35001.html

