Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/01/01 09:15:21 UTC
[GitHub] [incubator-doris] morningman opened a new pull request #2632:
[Rowset Reader] Improve the merge read efficiency of alpha rowsets
morningman opened a new pull request #2632: [Rowset Reader] Improve the merge read efficiency of alpha rowsets
URL: https://github.com/apache/incubator-doris/pull/2632
When a merge read spans multiple rowsets, or one rowset with multiple overlapping segments,
I introduce a priority queue (a min-heap) for multi-way merge sort,
replacing the old algorithm, which had O(N^2) time complexity.
This can significantly improve read efficiency when merging a large amount of
overlapping data.
In my test:
1. Compaction with 187 segments: time reduced from 75 seconds to 42 seconds.
2. Compaction with 3574 segments: took 43 seconds. With the old version, I killed the
process after waiting more than 10 minutes.
This CL only changes the read path of alpha rowsets. Beta rowsets will be changed in another CL.
ISSUE: #2631
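For readers unfamiliar with the technique, the multi-way merge described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this pull request: SegmentCursor, SegmentCursorComparator, and merge_sorted_runs are hypothetical names, and plain sorted vectors of int stand in for segments and rows. With N rows spread over k sorted runs, each row costs one heap pop and at most one push, so the merge takes O(N log k) comparisons.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-in for a per-segment read cursor (not the Doris MergeContext).
struct SegmentCursor {
    const std::vector<int>* rows;  // one sorted run, e.g. one segment
    std::size_t pos;               // index of the next unread row
    std::size_t ordinal;           // run index, used to break ties deterministically

    int current() const { return (*rows)[pos]; }
};

// std::priority_queue is a max-heap with respect to its comparator, so
// returning true when x should be popped AFTER y gives min-heap behavior.
struct SegmentCursorComparator {
    bool operator()(const SegmentCursor& x, const SegmentCursor& y) const {
        if (x.current() != y.current()) {
            return x.current() > y.current();
        }
        return x.ordinal > y.ordinal;
    }
};

// Merge k sorted runs into one sorted output.
std::vector<int> merge_sorted_runs(const std::vector<std::vector<int>>& runs) {
    std::priority_queue<SegmentCursor, std::vector<SegmentCursor>,
                        SegmentCursorComparator> merge_queue;
    std::size_t total = 0;
    for (std::size_t i = 0; i < runs.size(); ++i) {
        // Assumption of this sketch: every input run is already sorted.
        assert(std::is_sorted(runs[i].begin(), runs[i].end()));
        total += runs[i].size();
        if (!runs[i].empty()) {
            merge_queue.push(SegmentCursor{&runs[i], 0, i});
        }
    }

    std::vector<int> merged;
    merged.reserve(total);
    while (!merge_queue.empty()) {
        SegmentCursor top = merge_queue.top();
        merge_queue.pop();
        merged.push_back(top.current());
        // Advance the cursor and re-insert it if its run still has rows.
        if (++top.pos < top.rows->size()) {
            merge_queue.push(top);
        }
    }
    return merged;
}
```

Only one cursor per run sits in the heap at any time, so the heap size is bounded by the number of runs rather than the number of rows.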
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org
[GitHub] [incubator-doris] morningman merged pull request #2632: [Rowset
Reader] Improve the merge read efficiency of alpha rowsets
Posted by GitBox <gi...@apache.org>.
morningman merged pull request #2632: [Rowset Reader] Improve the merge read efficiency of alpha rowsets
URL: https://github.com/apache/incubator-doris/pull/2632
[GitHub] [incubator-doris] imay commented on a change in pull request #2632:
[Rowset Reader] Improve the merge read efficiency of alpha rowsets
Posted by GitBox <gi...@apache.org>.
imay commented on a change in pull request #2632: [Rowset Reader] Improve the merge read efficiency of alpha rowsets
URL: https://github.com/apache/incubator-doris/pull/2632#discussion_r362320010
##########
File path: be/src/olap/rowset/alpha_rowset_reader.h
##########
@@ -103,6 +128,9 @@ class AlphaRowsetReader : public RowsetReader {
RowsetReaderContext* _current_read_context;
OlapReaderStatistics _owned_stats;
OlapReaderStatistics* _stats = &_owned_stats;
+
+ // a priority queue for merging rowsets
+ std::priority_queue<RowCursorWithOrdinal, vector<RowCursorWithOrdinal>, RowCursorWithOrdinalComparator> _merge_queue;
Review comment:
Why not use **RowCursorWithOrdinal** as the element of the queue? Then there would be no need to record the ordinal.
[GitHub] [incubator-doris] morningman commented on a change in pull request
#2632: [Rowset Reader] Improve the merge read efficiency of alpha rowsets
Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #2632: [Rowset Reader] Improve the merge read efficiency of alpha rowsets
URL: https://github.com/apache/incubator-doris/pull/2632#discussion_r362377473
##########
File path: be/src/olap/rowset/alpha_rowset_reader.h
##########
@@ -103,6 +128,9 @@ class AlphaRowsetReader : public RowsetReader {
RowsetReaderContext* _current_read_context;
OlapReaderStatistics _owned_stats;
OlapReaderStatistics* _stats = &_owned_stats;
+
+ // a priority queue for merging rowsets
+ std::priority_queue<RowCursorWithOrdinal, vector<RowCursorWithOrdinal>, RowCursorWithOrdinalComparator> _merge_queue;
Review comment:
I removed RowCursorWithOrdinal and now use MergeContext directly.
[GitHub] [incubator-doris] imay commented on a change in pull request #2632:
[Rowset Reader] Improve the merge read efficiency of alpha rowsets
Posted by GitBox <gi...@apache.org>.
imay commented on a change in pull request #2632: [Rowset Reader] Improve the merge read efficiency of alpha rowsets
URL: https://github.com/apache/incubator-doris/pull/2632#discussion_r362319571
##########
File path: be/src/olap/rowset/alpha_rowset_reader.cpp
##########
@@ -326,4 +382,8 @@ RowsetSharedPtr AlphaRowsetReader::rowset() {
return std::static_pointer_cast<Rowset>(_rowset);
}
+bool RowCursorWithOrdinalComparator::operator () (const RowCursorWithOrdinal &x, const RowCursorWithOrdinal &y) const {
Review comment:
```suggestion
bool RowCursorWithOrdinalComparator::operator () (const RowCursorWithOrdinal& x, const RowCursorWithOrdinal& y) const {
```