Posted to commits@doris.apache.org by "xiaokang (via GitHub)" <gi...@apache.org> on 2023/06/28 15:47:58 UTC

[GitHub] [doris] xiaokang opened a new pull request, #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

xiaokang opened a new pull request, #21310:
URL: https://github.com/apache/doris/pull/21310

   ## Proposed changes
   
   Issue Number: close #xxx
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] airborne12 commented on a diff in pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "airborne12 (via GitHub)" <gi...@apache.org>.
airborne12 commented on code in PR #21310:
URL: https://github.com/apache/doris/pull/21310#discussion_r1248374762


##########
be/src/olap/null_predicate.h:
##########
@@ -92,7 +94,7 @@ class NullPredicate : public ColumnPredicate {
         }
     }
 
-    bool can_do_bloom_filter() const override { return _is_null; }
+    bool can_do_bloom_filter(bool ngram) const override { return ngram ? false : _is_null; }

Review Comment:
   Maybe `bool can_do_bloom_filter(bool ngram) const override { return !ngram && _is_null; }` would be better.
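
The two forms in this comment are logically equivalent, which a small standalone check can confirm. This is only a sketch: `ternary_form`, `conjunction_form`, and `check_forms_agree` are hypothetical free functions written for illustration and are not part of Doris; `is_null` stands in for the `_is_null` member.

```cpp
#include <cassert>

// Hypothetical stand-in for the form in the diff: `ngram ? false : _is_null`.
constexpr bool ternary_form(bool ngram, bool is_null) {
    return ngram ? false : is_null;
}

// Hypothetical stand-in for the suggested form: `!ngram && _is_null`.
constexpr bool conjunction_form(bool ngram, bool is_null) {
    return !ngram && is_null;
}

// Compile-time spot checks of the two cases that matter most.
static_assert(ternary_form(true, true) == conjunction_form(true, true),
              "ngram index: both forms reject the bloom filter");
static_assert(ternary_form(false, true) == conjunction_form(false, true),
              "normal index: both forms follow is_null");

// Runtime check over all four input combinations.
inline void check_forms_agree() {
    const bool values[] = {false, true};
    for (bool ngram : values) {
        for (bool is_null : values) {
            assert(ternary_form(ngram, is_null) == conjunction_form(ngram, is_null));
        }
    }
}
```

Since both forms return the same value for every input, the choice between them is purely about readability.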
   





[GitHub] [doris] xiaokang commented on a diff in pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "xiaokang (via GitHub)" <gi...@apache.org>.
xiaokang commented on code in PR #21310:
URL: https://github.com/apache/doris/pull/21310#discussion_r1253803673


##########
be/src/olap/null_predicate.h:
##########
@@ -92,7 +94,7 @@ class NullPredicate : public ColumnPredicate {
         }
     }
 
-    bool can_do_bloom_filter() const override { return _is_null; }
+    bool can_do_bloom_filter(bool ngram) const override { return ngram ? false : _is_null; }

Review Comment:
   changed



##########
be/src/olap/rowset/segment_v2/column_reader.h:
##########
@@ -137,7 +137,12 @@ class ColumnReader {
 
     bool has_zone_map() const { return _zone_map_index_meta != nullptr; }
     bool has_bitmap_index() const { return _bitmap_index_meta != nullptr; }
-    bool has_bloom_filter_index() const { return _bf_index_meta != nullptr; }
+    bool has_bloom_filter_index(bool ngram) const {
+        return _bf_index_meta != nullptr &&
+               (ngram ? (_bf_index_meta->algorithm() == BloomFilterAlgorithmPB::NGRAM_BLOOM_FILTER)
+                      : (_bf_index_meta->algorithm() !=
+                         BloomFilterAlgorithmPB::NGRAM_BLOOM_FILTER));
+    }

Review Comment:
   changed





[GitHub] [doris] github-actions[bot] commented on pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21310:
URL: https://github.com/apache/doris/pull/21310#issuecomment-1633429878

   PR approved by at least one committer and no changes requested.




[GitHub] [doris] xiaokang commented on pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "xiaokang (via GitHub)" <gi...@apache.org>.
xiaokang commented on PR #21310:
URL: https://github.com/apache/doris/pull/21310#issuecomment-1633401427

   run buildall 




[GitHub] [doris] github-actions[bot] commented on pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21310:
URL: https://github.com/apache/doris/pull/21310#issuecomment-1629835265

   PR approved by anyone and no changes requested.




[GitHub] [doris] hello-stephen commented on pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "hello-stephen (via GitHub)" <gi...@apache.org>.
hello-stephen commented on PR #21310:
URL: https://github.com/apache/doris/pull/21310#issuecomment-1633430745

   (From new machine) TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 52.64 seconds
    stream load tsv:          512 seconds loaded 74807831229 Bytes, about 139 MB/s
    stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
    stream load orc:          65 seconds loaded 1101869774 Bytes, about 16 MB/s
    stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
    insert into select:       29.2 seconds inserted 10000000 Rows, about 342K ops/s
    storage size:             17164968599 Bytes
    https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230713100929_clickbench_pr_177611.html




[GitHub] [doris] github-actions[bot] commented on pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21310:
URL: https://github.com/apache/doris/pull/21310#issuecomment-1611693018

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] compasses commented on a diff in pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "compasses (via GitHub)" <gi...@apache.org>.
compasses commented on code in PR #21310:
URL: https://github.com/apache/doris/pull/21310#discussion_r1246150316


##########
be/src/olap/rowset/segment_v2/column_reader.h:
##########
@@ -137,7 +137,12 @@ class ColumnReader {
 
     bool has_zone_map() const { return _zone_map_index_meta != nullptr; }
     bool has_bitmap_index() const { return _bitmap_index_meta != nullptr; }
-    bool has_bloom_filter_index() const { return _bf_index_meta != nullptr; }
+    bool has_bloom_filter_index(bool ngram) const {
+        return _bf_index_meta != nullptr &&
+               (ngram ? (_bf_index_meta->algorithm() == BloomFilterAlgorithmPB::NGRAM_BLOOM_FILTER)
+                      : (_bf_index_meta->algorithm() !=
+                         BloomFilterAlgorithmPB::NGRAM_BLOOM_FILTER));
+    }

Review Comment:
   This condition is a bit complicated :)
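
One possible way to flatten the condition, shown only as a sketch: compare "is this an ngram index?" directly against the requested kind. The `Algorithm` enum and `IndexMeta` struct below are hypothetical stand-ins for `BloomFilterAlgorithmPB` and the real index metadata type, and neither function is claimed to be what the PR finally adopted.

```cpp
#include <cassert>

// Hypothetical stand-in for BloomFilterAlgorithmPB; only the ngram value matters here.
enum class Algorithm { BLOCK_BLOOM_FILTER, NGRAM_BLOOM_FILTER };

// Hypothetical stand-in for the bloom filter index metadata.
struct IndexMeta {
    Algorithm algo;
    Algorithm algorithm() const { return algo; }
};

// Shape of the condition in the diff: choose == or != depending on `ngram`.
inline bool has_bf_index_original(const IndexMeta* meta, bool ngram) {
    return meta != nullptr &&
           (ngram ? (meta->algorithm() == Algorithm::NGRAM_BLOOM_FILTER)
                  : (meta->algorithm() != Algorithm::NGRAM_BLOOM_FILTER));
}

// Equivalent shape: the kind of index present must match the kind requested.
inline bool has_bf_index_simplified(const IndexMeta* meta, bool ngram) {
    if (meta == nullptr) return false;
    const bool is_ngram = meta->algorithm() == Algorithm::NGRAM_BLOOM_FILTER;
    return is_ngram == ngram;
}

// Checks that both shapes agree for every metadata/flag combination.
inline void check_equivalence() {
    const IndexMeta block_meta{Algorithm::BLOCK_BLOOM_FILTER};
    const IndexMeta ngram_meta{Algorithm::NGRAM_BLOOM_FILTER};
    const IndexMeta* metas[] = {nullptr, &block_meta, &ngram_meta};
    const bool flags[] = {false, true};
    for (const IndexMeta* meta : metas) {
        for (bool ngram : flags) {
            assert(has_bf_index_original(meta, ngram) == has_bf_index_simplified(meta, ngram));
        }
    }
}
```

Naming the intermediate `is_ngram` value keeps the null check, the algorithm test, and the comparison on separate lines, which may be easier to read than the nested ternary.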





[GitHub] [doris] github-actions[bot] commented on pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21310:
URL: https://github.com/apache/doris/pull/21310#issuecomment-1633408097

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] xiaokang commented on pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "xiaokang (via GitHub)" <gi...@apache.org>.
xiaokang commented on PR #21310:
URL: https://github.com/apache/doris/pull/21310#issuecomment-1612260322

   run buildall 




[GitHub] [doris] Gabriel39 commented on a diff in pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "Gabriel39 (via GitHub)" <gi...@apache.org>.
Gabriel39 commented on code in PR #21310:
URL: https://github.com/apache/doris/pull/21310#discussion_r1257631043


##########
be/src/olap/in_list_predicate.h:
##########
@@ -381,6 +381,8 @@ class InListPredicateBase : public ColumnPredicate {
 
     bool evaluate_and(const segment_v2::BloomFilter* bf) const override {
         if constexpr (PT == PredicateType::IN_LIST) {
+            // IN predicate can not use ngram bf, just return true to accept
+            if (bf->is_ngram_bf()) return true;

Review Comment:
   For a delete condition, is this always true?





[GitHub] [doris] eldenmoon merged pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "eldenmoon (via GitHub)" <gi...@apache.org>.
eldenmoon merged PR #21310:
URL: https://github.com/apache/doris/pull/21310




[GitHub] [doris] github-actions[bot] commented on pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21310:
URL: https://github.com/apache/doris/pull/21310#issuecomment-1622766520

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] xiaokang commented on pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "xiaokang (via GitHub)" <gi...@apache.org>.
xiaokang commented on PR #21310:
URL: https://github.com/apache/doris/pull/21310#issuecomment-1622790394

   run buildall 




[GitHub] [doris] compasses commented on pull request #21310: [bugfix](ngram bf index) process differently for normal bloom filter index and ngram bf index

Posted by "compasses (via GitHub)" <gi...@apache.org>.
compasses commented on PR #21310:
URL: https://github.com/apache/doris/pull/21310#issuecomment-1612478785

   LGTM

