Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/10/16 05:15:43 UTC
[GitHub] [doris] mrhhsg opened a new pull request, #13397: [improvement](scan) speed up inserting strings into ColumnString
mrhhsg opened a new pull request, #13397:
URL: https://github.com/apache/doris/pull/13397
# Proposed changes
Test SQL:
```sql
select count(distinct url) from hits where UserID % 5 = 0;
```
Without this optimization:
<img width="1655" alt="image" src="https://user-images.githubusercontent.com/1179834/196019282-32dd905d-5cfd-40e7-8de0-cb402652c903.png">
With this optimization:
<img width="834" alt="image" src="https://user-images.githubusercontent.com/1179834/196019272-5d169f5a-4a82-4efd-baf8-98d2972fc51b.png">
## Problem summary
Describe your changes.
## Checklist(Required)
1. Does it affect the original behavior:
- [ ] Yes
- [ ] No
- [ ] I don't know
2. Has unit tests been added:
- [ ] Yes
- [ ] No
- [ ] No Need
3. Has document been added or modified:
- [ ] Yes
- [ ] No
- [ ] No Need
4. Does it need to update dependencies:
- [ ] Yes
- [ ] No
5. Are there any changes that cannot be rolled back:
- [ ] Yes (If Yes, please explain WHY)
- [ ] No
## Further comments
If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org
[GitHub] [doris] github-actions[bot] commented on pull request #13397: [improvement](scan) speed up inserting strings into ColumnString
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13397:
URL: https://github.com/apache/doris/pull/13397#issuecomment-1296141864
PR approved by anyone and no changes requested.
[GitHub] [doris] HappenLee commented on a diff in pull request #13397: [improvement](scan) speed up inserting strings into ColumnString
Posted by GitBox <gi...@apache.org>.
HappenLee commented on code in PR #13397:
URL: https://github.com/apache/doris/pull/13397#discussion_r996661974
##########
be/src/vec/columns/predicate_column.h:
##########
@@ -91,13 +91,17 @@ class PredicateColumnType final : public COWHelper<IColumn, PredicateColumnType<
void insert_string_to_res_column(const uint16_t* sel, size_t sel_size,
vectorized::ColumnString* res_ptr) {
StringRef refs[sel_size];
+ size_t length = 0;
for (size_t i = 0; i < sel_size; i++) {
uint16_t n = sel[i];
auto& sv = reinterpret_cast<StringValue&>(data[n]);
refs[i].data = sv.ptr;
refs[i].size = sv.len;
+ length += sv.len;
}
- res_ptr->insert_many_continuous_strings(refs, sel_size);
Review Comment:
This is the only place that calls `insert_many_continuous_strings`. Once this call is deleted, we should delete the function too.
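For readers following the hunk above: it accumulates `length` while building `refs`, which suggests the total byte count is then used to grow the destination buffer once instead of once per string. The hunk is cut off before the replacement call, so the following is only a minimal sketch of that general technique under stated assumptions. `StrRef`, `SimpleStringColumn`, `insert_many`, and `insert_selected` are hypothetical stand-ins for illustration, not the Doris types or the code that was merged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a (pointer, length) string reference.
struct StrRef {
    const char* data;
    size_t size;
};

// Hypothetical simplified string column: all bytes stored back to back,
// with offsets[i] holding the end position of string i in chars.
struct SimpleStringColumn {
    std::vector<char> chars;
    std::vector<uint32_t> offsets;

    // Append num strings. total_len is the precomputed sum of their sizes,
    // so each buffer is grown a single time before the copy loop.
    void insert_many(const StrRef* refs, size_t num, size_t total_len) {
        const size_t old_size = chars.size();
        chars.resize(old_size + total_len);
        offsets.reserve(offsets.size() + num);

        char* dst = chars.data() + old_size;
        size_t written = 0;
        for (size_t i = 0; i < num; ++i) {
            if (refs[i].size != 0) {
                std::memcpy(dst + written, refs[i].data, refs[i].size);
            }
            written += refs[i].size;
            offsets.push_back(static_cast<uint32_t>(old_size + written));
        }
    }
};

// Gather the rows named by the selection vector, summing their lengths in
// the same pass (mirrors the `length += sv.len` line in the hunk).
void insert_selected(const std::vector<std::string>& rows, const uint16_t* sel,
                     size_t sel_size, SimpleStringColumn& out) {
    std::vector<StrRef> refs(sel_size);
    size_t length = 0;
    for (size_t i = 0; i < sel_size; ++i) {
        const std::string& s = rows[sel[i]];
        refs[i] = StrRef{s.data(), s.size()};
        length += s.size();
    }
    out.insert_many(refs.data(), sel_size, length);
}
```

The sketch uses a `std::vector<StrRef>` rather than the variable-length array in the hunk, only to stay within standard C++.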
[GitHub] [doris] hello-stephen commented on pull request #13397: [improvement](scan) speed up inserting strings into ColumnString
Posted by GitBox <gi...@apache.org>.
hello-stephen commented on PR #13397:
URL: https://github.com/apache/doris/pull/13397#issuecomment-1293576658
TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 37.78 seconds
load time: 569 seconds
storage size: 17154699360 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20221027220431_clickbench_pr_34852.html
[GitHub] [doris] HappenLee merged pull request #13397: [improvement](scan) speed up inserting strings into ColumnString
Posted by GitBox <gi...@apache.org>.
HappenLee merged PR #13397:
URL: https://github.com/apache/doris/pull/13397
[GitHub] [doris] github-actions[bot] commented on pull request #13397: [improvement](scan) speed up inserting strings into ColumnString
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13397:
URL: https://github.com/apache/doris/pull/13397#issuecomment-1296141851
PR approved by at least one committer and no changes requested.
[GitHub] [doris] hello-stephen commented on pull request #13397: [improvement](scan) speed up inserting strings into ColumnString
Posted by GitBox <gi...@apache.org>.
hello-stephen commented on PR #13397:
URL: https://github.com/apache/doris/pull/13397#issuecomment-1294290570
TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 38.15 seconds
load time: 575 seconds
storage size: 17154820913 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20221028091430_clickbench_pr_35001.html