Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2023/01/17 12:13:50 UTC

[GitHub] [doris] kaka11chen opened a new issue, #16023: [Enhancement] Optimize the position delete file filtering mechanism in iceberg v2.

kaka11chen opened a new issue, #16023:
URL: https://github.com/apache/doris/issues/16023

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Description
   
   Currently, after reading the delete positions from Iceberg position delete files, the parquet reader filters rows by splitting the delete positions into separate row ranges. When there are many delete positions, the number of row ranges grows, which increases the number of virtual function calls needed to read decoded columns and causes `column_data.resize()` to be called too many times.
   
   parquet_common.h
   ```
   template <typename Numeric>
   Status FixLengthDecoder::_decode_numeric(MutableColumnPtr& doris_column,
                                            ColumnSelectVector& select_vector) {
       auto& column_data = static_cast<ColumnVector<Numeric>&>(*doris_column).get_data();
       size_t data_index = column_data.size();
       column_data.resize(data_index + select_vector.num_values() - select_vector.num_filtered());
       ...
   ```
   
   ### Solution
   
   Merge the delete position filter into the condition filter, so that deleted rows are filtered out in the same pass as rows rejected by the query conditions.
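
The idea can be illustrated with a minimal, standalone sketch. It is not the Doris implementation: the helper names `build_delete_filter`, `merge_filters`, and `apply_filter` are hypothetical, plain `std::vector` stands in for Doris column types, and the delete positions are assumed to be sorted. The point is only that the delete positions are folded into one keep/drop map together with the condition result, so the output is sized once per batch instead of once per row range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: build a keep/drop map (1 = keep, 0 = drop) for the rows
// [first_row, first_row + num_rows) from a SORTED list of delete positions.
std::vector<uint8_t> build_delete_filter(const std::vector<int64_t>& delete_positions,
                                         int64_t first_row, size_t num_rows) {
    std::vector<uint8_t> keep(num_rows, 1);
    const int64_t end_row = first_row + static_cast<int64_t>(num_rows);
    // Skip delete positions that lie before this batch.
    auto it = std::lower_bound(delete_positions.begin(), delete_positions.end(), first_row);
    for (; it != delete_positions.end() && *it < end_row; ++it) {
        keep[static_cast<size_t>(*it - first_row)] = 0;
    }
    return keep;
}

// Hypothetical helper: fold the delete filter into the condition filter, so a
// row survives only if it passes the conditions AND is not deleted.
void merge_filters(std::vector<uint8_t>& condition_filter,
                   const std::vector<uint8_t>& delete_filter) {
    assert(condition_filter.size() == delete_filter.size());
    for (size_t i = 0; i < condition_filter.size(); ++i) {
        condition_filter[i] &= delete_filter[i];
    }
}

// Apply the merged filter in one pass, sizing the output a single time.
template <typename T>
std::vector<T> apply_filter(const std::vector<T>& values, const std::vector<uint8_t>& keep) {
    assert(values.size() == keep.size());
    size_t selected = 0;
    for (uint8_t k : keep) selected += (k != 0);
    std::vector<T> out;
    out.reserve(selected);  // one allocation, regardless of how many rows were deleted
    for (size_t i = 0; i < values.size(); ++i) {
        if (keep[i]) out.push_back(values[i]);
    }
    return out;
}
```

For example, with delete positions `{1, 4}` over a six-row batch and a condition filter that rejects row 2, the merged filter keeps rows 0, 3, and 5, and `apply_filter` produces those three values with a single `reserve` call.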
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] morningman closed issue #16023: [Enhancement] Optimize the position delete file filtering mechanism in iceberg v2.

Posted by "morningman (via GitHub)" <gi...@apache.org>.
morningman closed issue #16023: [Enhancement] Optimize the position delete file filtering mechanism in iceberg v2.
URL: https://github.com/apache/doris/issues/16023

