Posted to commits@doris.apache.org by "yiguolei (via GitHub)" <gi...@apache.org> on 2023/06/25 09:30:39 UTC
[GitHub] [doris] yiguolei opened a new pull request, #21141: [performace](colddata) opt cold data read performance
yiguolei opened a new pull request, #21141:
URL: https://github.com/apache/doris/pull/21141
## Proposed changes
In https://github.com/apache/doris/pull/10370, we tried to optimize string predicate evaluation by rewriting the predicate using dictionary values. That rewrite has to check whether the string column is fully dictionary-encoded, so we added logic that reads the last page of the string column to check.
However, this performs poorly on cold data, because it has to load the column's ordinal index and zone map index. Consider a query such as select * from table where pk_col=1. When the query condition is on the primary key, the result may be only a few rows, but those rows may have 100 columns, and loading these indices for every column takes a long time. The cost shows up as a large block_init_time.
In my test, I queried a table with 50 string columns by primary key.
The first read time drops from 220ms to 40ms.
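The description does not spell out the fix itself. One plausible reading, consistent with the `tmp_is_pred_column` vector in the diff quoted later in this thread, is that the dictionary-encoding check is only needed for columns that actually carry a predicate. The following is a minimal sketch of that idea under this assumption; `ColumnSketch` and `init_columns` are hypothetical names for illustration and are not part of the Doris code base.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-column reader; not the Doris API.
struct ColumnSketch {
    bool is_predicate_column = false;  // column appears in a predicate
    bool indices_loaded = false;       // ordinal / zone map index was loaded
    bool dict_check_done = false;      // full-dict-encoding check was run
};

// Runs the (expensive) dict-encoding check only for predicate columns.
// Returns the number of columns whose indices had to be loaded.
inline std::size_t init_columns(std::vector<ColumnSketch>& columns) {
    std::size_t loads = 0;
    for (auto& col : columns) {
        if (!col.is_predicate_column) {
            continue;  // no predicate to rewrite, so skip the check
        }
        col.indices_loaded = true;   // models loading the indices
        col.dict_check_done = true;  // models reading the last page
        ++loads;
    }
    return loads;
}
```

With 50 columns and a single predicate column, this sketch loads indices once instead of 50 times, which is the kind of saving the timing above points to.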
## Further comments
If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org
[GitHub] [doris] yiguolei commented on pull request #21141: [performace](colddata) opt cold data read performance
Posted by "yiguolei (via GitHub)" <gi...@apache.org>.
yiguolei commented on PR #21141:
URL: https://github.com/apache/doris/pull/21141#issuecomment-1605996157
run buildall
[GitHub] [doris] hello-stephen commented on pull request #21141: [performace](colddata) opt cold data read performance
Posted by "hello-stephen (via GitHub)" <gi...@apache.org>.
hello-stephen commented on PR #21141:
URL: https://github.com/apache/doris/pull/21141#issuecomment-1606133263
TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 37.61 seconds
stream load tsv: 451 seconds loaded 74807831229 Bytes, about 158 MB/s
stream load json: 22 seconds loaded 2358488459 Bytes, about 102 MB/s
stream load orc: 57 seconds loaded 1101869774 Bytes, about 18 MB/s
stream load parquet: 29 seconds loaded 861443392 Bytes, about 28 MB/s
insert into select: 69.9 seconds inserted 10000000 Rows, about 143K ops/s
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230625152845_clickbench_pr_167389.html
[GitHub] [doris] mrhhsg commented on a diff in pull request #21141: [performace](colddata) opt cold data read performance
Posted by "mrhhsg (via GitHub)" <gi...@apache.org>.
mrhhsg commented on code in PR #21141:
URL: https://github.com/apache/doris/pull/21141#discussion_r1241157243
##########
be/src/olap/rowset/segment_v2/segment_iterator.cpp:
##########
@@ -983,6 +983,22 @@ Status SegmentIterator::_init_return_column_iterators() {
new RowIdColumnIterator(_opts.tablet_id, _opts.rowset_id, _segment->id()));
continue;
}
+ std::set<ColumnId> del_cond_id_set;
+ _opts.delete_condition_predicates->get_all_column_ids(del_cond_id_set);
+ std::vector<bool> tmp_is_pred_column;
+ tmp_is_pred_column.resize(_schema->columns().size(), false);
+ if (!_col_predicates.empty() || !del_cond_id_set.empty()) {
Review Comment:
There is no need to check for empty; it might be better to simply iterate with the two for loops directly?
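The suggestion can be sketched as follows. This is only an illustration: `mark_predicate_columns` is a hypothetical helper, the `ColumnId` alias is assumed to be a 32-bit unsigned integer, and predicates are reduced to a plain list of column ids. A range-based for loop over an empty container runs zero times, so the surrounding empty check adds nothing.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Assumed alias for illustration; the real definition lives in Doris.
using ColumnId = std::uint32_t;

// Hypothetical helper: flags every column referenced by a predicate or
// by a delete condition. No empty() checks are needed, because looping
// over an empty container is a no-op.
inline std::vector<bool> mark_predicate_columns(
        std::size_t num_columns,
        const std::vector<ColumnId>& predicate_column_ids,
        const std::set<ColumnId>& del_cond_id_set) {
    std::vector<bool> is_pred_column(num_columns, false);
    for (ColumnId cid : predicate_column_ids) {
        if (cid < num_columns) {
            is_pred_column[cid] = true;
        }
    }
    for (ColumnId cid : del_cond_id_set) {
        if (cid < num_columns) {
            is_pred_column[cid] = true;
        }
    }
    return is_pred_column;
}
```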
[GitHub] [doris] yiguolei merged pull request #21141: [performace](colddata) opt cold data read performance
Posted by "yiguolei (via GitHub)" <gi...@apache.org>.
yiguolei merged PR #21141:
URL: https://github.com/apache/doris/pull/21141
[GitHub] [doris] github-actions[bot] commented on pull request #21141: [performace](colddata) opt cold data read performance
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21141:
URL: https://github.com/apache/doris/pull/21141#issuecomment-1606506114
PR approved by at least one committer and no changes requested.
[GitHub] [doris] yiguolei commented on pull request #21141: [performace](colddata) opt cold data read performance
Posted by "yiguolei (via GitHub)" <gi...@apache.org>.
yiguolei commented on PR #21141:
URL: https://github.com/apache/doris/pull/21141#issuecomment-1606110034
run buildall
[GitHub] [doris] github-actions[bot] commented on pull request #21141: [performace](colddata) opt cold data read performance
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21141:
URL: https://github.com/apache/doris/pull/21141#issuecomment-1605991178
clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] github-actions[bot] commented on pull request #21141: [performace](colddata) opt cold data read performance
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21141:
URL: https://github.com/apache/doris/pull/21141#issuecomment-1606068846
clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] github-actions[bot] commented on pull request #21141: [performace](colddata) opt cold data read performance
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21141:
URL: https://github.com/apache/doris/pull/21141#issuecomment-1606071166
PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #21141: [performace](colddata) opt cold data read performance
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21141:
URL: https://github.com/apache/doris/pull/21141#issuecomment-1606069875
clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] github-actions[bot] commented on pull request #21141: [performace](colddata) opt cold data read performance
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21141:
URL: https://github.com/apache/doris/pull/21141#issuecomment-1605990825
clang-tidy review says "All clean, LGTM! :+1:"