Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/06/27 16:22:17 UTC

[GitHub] [doris] mrhhsg opened a new pull request, #10460: [fix]Core dump in SegmentIterator when there is no predicate

mrhhsg opened a new pull request, #10460:
URL: https://github.com/apache/doris/pull/10460

   # Proposed changes
   
   Issue Number: close #10453 
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] BiteTheDDDDt commented on pull request #10460: [fix]Core dump in SegmentIterator when there is no predicate

Posted by GitBox <gi...@apache.org>.
BiteTheDDDDt commented on PR #10460:
URL: https://github.com/apache/doris/pull/10460#issuecomment-1168191257

   ```sql
   MySQL [test]> select count(*) from test;
   +----------+
   | count(*) |
   +----------+
   |    60017 |
   +----------+
   1 row in set (0.181 sec)
   
   MySQL [test]> select count(*) from test;
   +----------+
   | count(*) |
   +----------+
   |    45150 |
   +----------+
   1 row in set (0.036 sec)
   ```
   Hi, I tested this and got a wrong result: the same query returns different counts on consecutive runs.




[GitHub] [doris] Gabriel39 commented on a diff in pull request #10460: [fix]Core dump in SegmentIterator when there is no predicate

Posted by GitBox <gi...@apache.org>.
Gabriel39 commented on code in PR #10460:
URL: https://github.com/apache/doris/pull/10460#discussion_r907941300


##########
be/src/olap/rowset/segment_v2/segment_iterator.h:
##########
@@ -159,6 +159,7 @@ class SegmentIterator : public RowwiseIterator {
     std::vector<ColumnId>
             _short_cir_pred_column_ids; // keep columnId of columns for short circuit predicate evaluation
     std::vector<bool> _is_first_read_column; // columns hold by segmentIter
+    std::vector<bool> _is_predicate_column;  // columns hold by segmentIter

Review Comment:
   Try not to use `vector<bool>`; use `vector<char>` or `bitset` instead.
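
   A minimal sketch of the usual reasoning behind this suggestion, not code from the PR: `std::vector<bool>` is a specialization whose `operator[]` returns a proxy object rather than `bool&`, while `std::vector<char>` hands out ordinary references. The helper `make_predicate_flags` and its parameters are hypothetical names used only for illustration. Note that `std::bitset` needs its size at compile time, so it only fits when the column count is fixed.

   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <type_traits>
   #include <vector>

   // vector<bool> is specialized: its element reference is a proxy class,
   // so generic code expecting bool& (or bool*) does not work with it.
   static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
                 "vector<bool>::reference is a proxy, not bool&");
   static_assert(std::is_same<std::vector<char>::reference, char&>::value,
                 "vector<char> yields real references");

   // Hypothetical helper: one flag per column id, stored as char (0 or 1).
   std::vector<char> make_predicate_flags(std::size_t num_columns,
                                          const std::vector<std::size_t>& predicate_cids) {
       std::vector<char> flags(num_columns, 0);
       for (std::size_t cid : predicate_cids) {
           assert(cid < num_columns);  // column id must be in range
           flags[cid] = 1;
       }
       return flags;
   }
   ```

   With `vector<char>`, `char& flag = flags[cid];` compiles and aliases the stored element; the equivalent `bool& flag = bool_flags[cid];` on a `vector<bool>` does not compile.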





[GitHub] [doris] mrhhsg commented on a diff in pull request #10460: [fix]Core dump in SegmentIterator when there is no predicate

Posted by GitBox <gi...@apache.org>.
mrhhsg commented on code in PR #10460:
URL: https://github.com/apache/doris/pull/10460#discussion_r907987843


##########
be/src/olap/rowset/segment_v2/segment_iterator.h:
##########
@@ -159,6 +159,7 @@ class SegmentIterator : public RowwiseIterator {
     std::vector<ColumnId>
             _short_cir_pred_column_ids; // keep columnId of columns for short circuit predicate evaluation
     std::vector<bool> _is_first_read_column; // columns hold by segmentIter
+    std::vector<bool> _is_predicate_column;  // columns hold by segmentIter

Review Comment:
   This vector will not be accessed frequently, so I think using `vector<bool>` is ok?





[GitHub] [doris] yiguolei closed pull request #10460: [fix]Core dump in SegmentIterator when there is no predicate

Posted by GitBox <gi...@apache.org>.
yiguolei closed pull request #10460: [fix]Core dump in SegmentIterator when there is no predicate
URL: https://github.com/apache/doris/pull/10460




[GitHub] [doris] yiguolei commented on a diff in pull request #10460: [fix]Core dump in SegmentIterator when there is no predicate

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #10460:
URL: https://github.com/apache/doris/pull/10460#discussion_r907920966


##########
be/src/olap/rowset/segment_v2/segment_iterator.cpp:
##########
@@ -1007,21 +1010,23 @@ Status SegmentIterator::next_batch(vectorized::Block* block) {
         for (size_t i = 0; i < _schema.num_column_ids(); i++) {
             auto cid = _schema.column_id(i);
             auto column_desc = _schema.column(cid);
-            if (_is_first_read_column[cid]) {
-                _current_return_columns[cid] = Schema::get_predicate_column_nullable_ptr(
-                        column_desc->type(), column_desc->is_nullable());
-                _current_return_columns[cid]->reserve(_opts.block_row_max);
-            } else if (i >= block->columns()) {
-                // if i >= block->columns means the column and not the pred_column means `column i` is
-                // a delete condition column. but the column is not effective in the segment. so we just
-                // create a column to hold the data.
-                // a. origin data -> b. delete condition -> c. new load data
-                // the segment of c do not effective delete condition, but it still need read the column
-                // to match the schema.
-                // TODO: skip read the not effective delete column to speed up segment read.
-                _current_return_columns[cid] =
-                        Schema::get_data_type_ptr(*column_desc)->create_column();
-                _current_return_columns[cid]->reserve(_opts.block_row_max);
+            if (_is_need_vec_eval || _is_need_short_eval) {
+                if (_is_predicate_column[cid]) {
+                    _current_return_columns[cid] = Schema::get_predicate_column_nullable_ptr(
+                            column_desc->type(), column_desc->is_nullable());
+                    _current_return_columns[cid]->reserve(_opts.block_row_max);
+                } else if (_is_first_read_column[cid] || i >= block->columns()) {
+                    // if i >= block->columns means the column and not the pred_column means `column i` is
+                    // a delete condition column. but the column is not effective in the segment. so we just
+                    // create a column to hold the data.
+                    // a. origin data -> b. delete condition -> c. new load data
+                    // the segment of c do not effective delete condition, but it still need read the column
+                    // to match the schema.
+                    // TODO: skip read the not effective delete column to speed up segment read.
+                    _current_return_columns[cid] =
+                            Schema::get_data_type_ptr(*column_desc)->create_column();

Review Comment:
   This is not correct: for a column that already exists in the block, the memory is supposed to be reused, so a new column should not be created here.
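
   The reuse pattern this comment refers to can be sketched roughly as follows. This is an illustration under stated assumptions, not the Doris implementation: `Column`, `ColumnPtr`, and `prepare_return_columns` are hypothetical stand-ins, and a plain `std::vector<int>` plays the role of a column.

   ```cpp
   #include <cstddef>
   #include <memory>
   #include <vector>

   // Hypothetical stand-ins for illustration only (not the Doris types).
   using Column = std::vector<int>;
   using ColumnPtr = std::shared_ptr<Column>;

   // For each output slot, reuse the column the caller's block already holds;
   // allocate a fresh column only for slots the block does not cover.
   std::vector<ColumnPtr> prepare_return_columns(const std::vector<ColumnPtr>& block,
                                                 std::size_t num_columns,
                                                 std::size_t reserve_rows) {
       std::vector<ColumnPtr> result(num_columns);
       for (std::size_t i = 0; i < num_columns; ++i) {
           if (i < block.size() && block[i]) {
               block[i]->clear();     // drop old rows; the buffer's capacity is kept
               result[i] = block[i];  // share the existing allocation
           } else {
               result[i] = std::make_shared<Column>();
               result[i]->reserve(reserve_rows);
           }
       }
       return result;
   }
   ```

   In this sketch, only slots with `i >= block.size()` (or an empty pointer) trigger an allocation, which mirrors the distinction the reviewer draws between columns already in the block and extra columns.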


