Posted to issues@carbondata.apache.org by xuchuanyin <gi...@git.apache.org> on 2018/07/26 15:29:40 UTC

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Rever...

GitHub user xuchuanyin opened a pull request:

    https://github.com/apache/carbondata/pull/2565

    [HotFix][CARBONDATA-2788][BloomDataMap] Revert optimization for blockletId in rebuilding datamap

    We found that querying a large data set after rebuilding a bloom datamap
    gives incorrect results. The root cause is that the blockletId in
    ResultCollector is wrong (this was introduced in PR2539).
    This PR reverts that modification. The revert has been checked and
    works fine.
    
    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
    
     - [ ] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata 0726_revert_rebuild_rdd_blockletno

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2565.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2565
    
----
commit 8889078ea9d1328366dc27d633b3f5ebf1906322
Author: xuchuanyin <xu...@...>
Date:   2018-07-26T15:22:58Z

    Revert optimize blockletId in rebuilding datamap
    
    We found querying huge data with rebuilding bloom datamap will give
    incorrect result. The root cause is that the blockletId in
    ResultCollector is wrong. (This was introduced in PR2539)
    We will revert the previous modification for this. Now it is checked and
    works fine.

----


---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r206367929
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala ---
    @@ -371,7 +371,7 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
           """
             | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
             | STORED BY 'carbondata'
    -        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT', 'CACHE_LEVEL'='BLOCKLET')
    --- End diff --
    
    By default the cache_level is BLOCK, which may affect the pruning info. Some test cases in this file assert on the content of the pruning info, so I changed the cache_level to BLOCKLET here to avoid modifying those assertions.
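
The reply above relies on BLOCK being the default cache level unless the table property says otherwise. A minimal sketch of that lookup, assuming the property key is the lowercase string `cache_level` (the diffs in this thread only show the constant names, not their values) and using a hypothetical `CacheLevelCheck` class rather than the CarbonData code itself:

```java
import java.util.HashMap;
import java.util.Map;

class CacheLevelCheck {
  // Assumed key; the thread only shows CarbonCommonConstants.CACHE_LEVEL by name.
  static final String CACHE_LEVEL_KEY = "cache_level";
  // Default taken from the reply above: "By default the cache_level is BLOCK".
  static final String CACHE_LEVEL_DEFAULT = "BLOCK";

  // Returns true only when the table explicitly asks for BLOCKLET-level caching.
  static boolean isBlockletCacheLevel(Map<String, String> tableProperties) {
    String level = tableProperties.getOrDefault(CACHE_LEVEL_KEY, CACHE_LEVEL_DEFAULT);
    return "BLOCKLET".equalsIgnoreCase(level);
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    System.out.println(isBlockletCacheLevel(props)); // prints "false" (falls back to BLOCK)
    props.put(CACHE_LEVEL_KEY, "blocklet");
    System.out.println(isBlockletCacheLevel(props)); // prints "true"
  }
}
```

The case-insensitive comparison mirrors the `equalsIgnoreCase("blocklet")` call visible in the BloomCoarseGrainDataMap diff elsewhere in this thread.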


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    verified, LGTM


---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r206367642
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java ---
    @@ -318,13 +318,22 @@ private DataMapRowImpl loadBlockMetaInfo(CarbonRowSchema[] taskSummarySchema,
                   blockMinValues, blockMaxValues);
           blockletCountInEachBlock.add(totalBlockletsInOneBlock);
         }
    -    byte[] blockletCount = ArrayUtils
    -        .toPrimitive(blockletCountInEachBlock.toArray(new Byte[blockletCountInEachBlock.size()]));
    +    byte[] blockletCount = convertRowCountFromShortToByteArray(blockletCountInEachBlock);
         // blocklet count index is the last index
         summaryRow.setByteArray(blockletCount, taskSummarySchema.length - 1);
         return summaryRow;
       }
     
    +  private byte[] convertRowCountFromShortToByteArray(List<Short> blockletCountInEachBlock) {
    --- End diff --
    
    Because we are using the offheap store, which needs to store the values as bytes.
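
The diff above changes the per-block blocklet counts from `Byte` to `Short` and then packs them into a `byte[]` for the summary row. The body of `convertRowCountFromShortToByteArray` is not shown in this thread, so the following is only a sketch of one way such a conversion could work, using `java.nio.ByteBuffer` and a hypothetical `BlockletCountBytes` class; the byte order and layout are assumptions:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

class BlockletCountBytes {
  // Packs each short into two bytes (big-endian, the ByteBuffer default).
  static byte[] toByteArray(List<Short> counts) {
    ByteBuffer buffer = ByteBuffer.allocate(counts.size() * Short.BYTES);
    for (short count : counts) {
      buffer.putShort(count);
    }
    return buffer.array();
  }

  // Reverses toByteArray: reads the bytes back two at a time.
  static short[] fromByteArray(byte[] bytes) {
    ByteBuffer buffer = ByteBuffer.wrap(bytes);
    short[] counts = new short[bytes.length / Short.BYTES];
    for (int i = 0; i < counts.length; i++) {
      counts[i] = buffer.getShort();
    }
    return counts;
  }

  public static void main(String[] args) {
    byte[] packed = toByteArray(Arrays.asList((short) 3, (short) 300));
    System.out.println(Arrays.toString(packed));                // prints "[0, 3, 1, 44]"
    System.out.println(Arrays.toString(fromByteArray(packed))); // prints "[3, 300]"
  }
}
```

The second value shows why a single byte per count would not be enough: 300 does not fit in the range of a Java `byte`.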


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    retest this please


---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r205934275
  
    --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java ---
    @@ -165,9 +180,11 @@ public void initIndexColumnConverters(CarbonTable carbonTable, List<CarbonColumn
           for (CarbonBloomFilter bloomFilter : bloomIndexList) {
             boolean scanRequired = bloomFilter.membershipTest(new Key(bloomQueryModel.filterValue));
             if (scanRequired) {
    +          String blockletNo =
    +              isBlockletCacheLevel ? String.valueOf(bloomFilter.getBlockletNo()) : "-1";
    --- End diff --
    
    ok, fixed


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6382/



---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r205657632
  
    --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/datamap/IndexDataMapRebuildRDD.scala ---
    @@ -357,13 +357,20 @@ class IndexDataMapRebuildRDD[K, V](
             // skip clear datamap and we will do this adter rebuild
             reader.setSkipClearDataMapAtClose(true)
     
    +        // currently blockletId in rowWithPosition is wrong, we cannot use it
    --- End diff --
    
    OK


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7580/



---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r206367396
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
    @@ -71,7 +71,7 @@
       /**
        * variable for cache level BLOCKLET
        */
    -  private static final String CACHE_LEVEL_BLOCKLET = "BLOCKLET";
    +  public static final String CACHE_LEVEL_BLOCKLET = "BLOCKLET";
    --- End diff --
    
    Because this member needs to be accessed outside this class. Currently in `CarbonInputFormat` we need to use this variable to know the current cache level.


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6021/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6336/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Revert optim...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6292/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6020/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7573/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7579/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6332/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    retest this please


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7567/



---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r206181478
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala ---
    @@ -371,7 +371,7 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
           """
             | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
             | STORED BY 'carbondata'
    -        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT', 'CACHE_LEVEL'='BLOCKLET')
    --- End diff --
    
    Why do we need to change "CACHE_LEVEL" to "BLOCKLET"?


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Hi, all @jackylk @ravipesala @manishgupta88 @chenliang613 
    I raised PR #2574 as an alternative implementation of this PR. Please check it as well.


---

[GitHub] carbondata issue #2565: WIP: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bug...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7590/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    retest this please


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6322/



---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by manishgupta88 <gi...@git.apache.org>.
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r205676555
  
    --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java ---
    @@ -165,9 +180,11 @@ public void initIndexColumnConverters(CarbonTable carbonTable, List<CarbonColumn
           for (CarbonBloomFilter bloomFilter : bloomIndexList) {
             boolean scanRequired = bloomFilter.membershipTest(new Key(bloomQueryModel.filterValue));
             if (scanRequired) {
    +          String blockletNo =
    +              isBlockletCacheLevel ? String.valueOf(bloomFilter.getBlockletNo()) : "-1";
    --- End diff --
    
    I think the Bloom datamap should return the actual blocklet ID. The old behavior should not change based on the cache level.


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7602/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Revert optim...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7538/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7577/



---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r205657316
  
    --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java ---
    @@ -103,7 +106,19 @@ public void init(DataMapModel dataMapModel) throws IOException {
       /**
        * init field converters for index columns
        */
    -  public void initIndexColumnConverters(CarbonTable carbonTable, List<CarbonColumn> indexedColumn) {
    +  public void initIndexColumnConverters(CarbonTable carbonTable, String dataMapName,
    +      List<CarbonColumn> indexedColumn) {
    +    String cacheLevel = MapUtils.getString(
    +        carbonTable.getTableInfo().getFactTable().getTableProperties(),
    +        CarbonCommonConstants.CACHE_LEVEL, CarbonCommonConstants.CACHE_LEVEL_DEFAULT_VALUE);
    +    this.isBlockletCacheLevel = cacheLevel.equalsIgnoreCase("blocklet");
    +    if (!this.isBlockletCacheLevel) {
    +      LOGGER.warn(
    +          String.format("BloomFilter datamap %s runs with cache_level=block for table %s.%s,"
    +              + " which may decrease its pruning performance",
    --- End diff --
    
    OK


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    LGTM


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7581/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6042/



---

[GitHub] carbondata issue #2565: WIP: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bug...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6345/



---

[GitHub] carbondata pull request #2565: WIP: [HotFix][CARBONDATA-2788][BloomDataMap] ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r206008805
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java ---
    @@ -492,6 +493,26 @@ protected Expression getFilterPredicates(Configuration configuration) {
         return prunedBlocklets;
       }
     
    +  private List<ExtendedBlocklet> intersectFilteredBlocklets(CarbonTable carbonTable,
    +      List<ExtendedBlocklet> previousDataMapPrunedBlocklets,
    +      List<ExtendedBlocklet> otherDataMapPrunedBlocklets) {
    +    List<ExtendedBlocklet> prunedBlocklets = null;
    +    if (BlockletDataMapUtil
    +        .isCacheLevelBlock(carbonTable, BlockletDataMapFactory.CACHE_LEVEL_BLOCKLET)) {
    +      prunedBlocklets = new ArrayList<>(otherDataMapPrunedBlocklets);
    +      // add blocklets from previous dataMap that are not filtered by other dataMaps
    +      for (ExtendedBlocklet previousBlocklet : previousDataMapPrunedBlocklets) {
    +        if (!otherDataMapPrunedBlocklets.contains(previousBlocklet)) {
    +          prunedBlocklets.add(previousBlocklet);
    --- End diff --
    
    This is supposed to behave the same as at blocklet level. Why should extra, non-pruned blocklets be added to the list? Is there a use case for this check?
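
The review comment above argues that the merge should be a plain intersection regardless of cache level. A minimal sketch of such an intersection, with blocklets reduced to string identifiers for illustration; `BlockletIntersection` and the identifier format are hypothetical, and this is not the code from the PR:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BlockletIntersection {
  // Keeps only the entries that both datamaps selected, in the order of `previous`.
  static List<String> intersect(List<String> previous, List<String> other) {
    Set<String> otherIds = new HashSet<>(other);
    List<String> result = new ArrayList<>();
    for (String id : previous) {
      if (otherIds.contains(id)) {
        result.add(id);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> previous = Arrays.asList("part-0/0", "part-0/1", "part-1/0");
    List<String> other = Arrays.asList("part-0/1", "part-1/0", "part-2/0");
    System.out.println(intersect(previous, other)); // prints "[part-0/1, part-1/0]"
  }
}
```

With this shape, an entry selected by only one datamap never reaches the result, which is the behavior the comment expects; the quoted diff instead appends such entries from the previous datamap.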


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    retest this please


---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r206178641
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
    @@ -71,7 +71,7 @@
       /**
        * variable for cache level BLOCKLET
        */
    -  private static final String CACHE_LEVEL_BLOCKLET = "BLOCKLET";
    +  public static final String CACHE_LEVEL_BLOCKLET = "BLOCKLET";
    --- End diff --
    
    Why was this changed to public?


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    retest sdv please



---

[GitHub] carbondata issue #2565: WIP: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bug...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    PR2574 is to replace this PR


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7545/



---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r206040947
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java ---
    @@ -492,6 +493,26 @@ protected Expression getFilterPredicates(Configuration configuration) {
         return prunedBlocklets;
       }
     
    +  private List<ExtendedBlocklet> intersectFilteredBlocklets(CarbonTable carbonTable,
    +      List<ExtendedBlocklet> previousDataMapPrunedBlocklets,
    +      List<ExtendedBlocklet> otherDataMapPrunedBlocklets) {
    +    List<ExtendedBlocklet> prunedBlocklets = null;
    +    if (BlockletDataMapUtil
    +        .isCacheLevelBlock(carbonTable, BlockletDataMapFactory.CACHE_LEVEL_BLOCKLET)) {
    +      prunedBlocklets = new ArrayList<>(otherDataMapPrunedBlocklets);
    +      // add blocklets from previous dataMap that are not filtered by other dataMaps
    +      for (ExtendedBlocklet previousBlocklet : previousDataMapPrunedBlocklets) {
    +        if (!otherDataMapPrunedBlocklets.contains(previousBlocklet)) {
    +          prunedBlocklets.add(previousBlocklet);
    --- End diff --
    
    Fixed as we discussed.


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    retest this please


---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r206178902
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java ---
    @@ -318,13 +318,22 @@ private DataMapRowImpl loadBlockMetaInfo(CarbonRowSchema[] taskSummarySchema,
                   blockMinValues, blockMaxValues);
           blockletCountInEachBlock.add(totalBlockletsInOneBlock);
         }
    -    byte[] blockletCount = ArrayUtils
    -        .toPrimitive(blockletCountInEachBlock.toArray(new Byte[blockletCountInEachBlock.size()]));
    +    byte[] blockletCount = convertRowCountFromShortToByteArray(blockletCountInEachBlock);
         // blocklet count index is the last index
         summaryRow.setByteArray(blockletCount, taskSummarySchema.length - 1);
         return summaryRow;
       }
     
    +  private byte[] convertRowCountFromShortToByteArray(List<Short> blockletCountInEachBlock) {
    --- End diff --
    
    Why is this conversion needed?


---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2565


---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r205656230
  
    --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/datamap/IndexDataMapRebuildRDD.scala ---
    @@ -357,13 +357,20 @@ class IndexDataMapRebuildRDD[K, V](
             // skip clear datamap and we will do this adter rebuild
             reader.setSkipClearDataMapAtClose(true)
     
    +        // currently blockletId in rowWithPosition is wrong, we cannot use it
    --- End diff --
    
    This is a bit confusing, can you rephrase it


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6327/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7621/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6299/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    retest this please



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    One problem remains: for the test case added in Bloom*FunctionSuite, the explain query output reports a negative number of skipped blocklets, for example:
    ```
    |== CarbonData Profiler ==
    Table Scan on test_rcd
     - total blocklets: 1
     - filter: (city <> null and city = city40)
     - pruned by Main DataMap
        - skipped blocklets: 0
     - pruned by CG DataMap
        - name: dm_rcd
        - provider: bloomfilter
        - skipped blocklets: -1
                                        |
    |== Physical Plan ==
    *FileScan carbondata default.test_rcd[id#172,country#173,city#174,population#175,random1#176,random2#177,random3#178,random4#179,random5#180,random6#181,random7#182,random8#183,random9#184,random10#185,random11#186,random12#187] PushedFilters: [IsNotNull(city), EqualTo(city,city40)]|
    ```


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6365/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7575/



---

[GitHub] carbondata pull request #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix b...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2565#discussion_r205655699
  
    --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java ---
    @@ -103,7 +106,19 @@ public void init(DataMapModel dataMapModel) throws IOException {
       /**
        * init field converters for index columns
        */
    -  public void initIndexColumnConverters(CarbonTable carbonTable, List<CarbonColumn> indexedColumn) {
    +  public void initIndexColumnConverters(CarbonTable carbonTable, String dataMapName,
    +      List<CarbonColumn> indexedColumn) {
    +    String cacheLevel = MapUtils.getString(
    +        carbonTable.getTableInfo().getFactTable().getTableProperties(),
    +        CarbonCommonConstants.CACHE_LEVEL, CarbonCommonConstants.CACHE_LEVEL_DEFAULT_VALUE);
    +    this.isBlockletCacheLevel = cacheLevel.equalsIgnoreCase("blocklet");
    +    if (!this.isBlockletCacheLevel) {
    +      LOGGER.warn(
    +          String.format("BloomFilter datamap %s runs with cache_level=block for table %s.%s,"
    +              + " which may decrease its pruning performance",
    --- End diff --
    
    change to `which may decrease its pruning benefit, which lead to read more data`


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6035/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6297/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6040/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    retest this please


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    I tested this on my machine; the PR works fine for querying data with bloomfilter.


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7543/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    As for the incorrect explain output, I raised JIRA 2797 to track it.


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6039/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6041/



---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by manishgupta88 <gi...@git.apache.org>.
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    LGTM


---

[GitHub] carbondata issue #2565: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2565
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7640/



---