You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by kevinjmh <gi...@git.apache.org> on 2018/11/05 12:31:21 UTC

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

GitHub user kevinjmh opened a pull request:

    https://github.com/apache/carbondata/pull/2900

    [CARBONDATA-3078] Disable explain collector for count star query without filter

    An issue is found about count star query without filter in explain command. It is a special case. It uses different plan. 
    Considering
    no useful information about block/blocklet pruning for count star query without filter, so disable explain collector and avoid the exception in https://issues.apache.org/jira/browse/CARBONDATA-3078
    
    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
    
     - [ ] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kevinjmh/carbondata explain_countstar

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2900.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2900
    
----
commit 46f07a25924e98d43dd9dcfad3a1c7cb0cd4d895
Author: Manhua <ke...@...>
Date:   2018-11-05T12:17:59Z

    disable explain collector for count star query without filter

----


---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1276/



---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
  
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9555/



---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r230970553
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
    
    I think this modification just try to avoid the problem but don't actually solve the problem.
    Can you explain what is the root cause of that problem?


---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2900
  
    LGTM


---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
  
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1519/



---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r231010174
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
    
    That's the weird part -- We are trying to remove the pruning collector even the pruning info is not initialized.
    I think you can add a flag for the collector to identify whether it is initialized. And this flag is used where carbon what to record the info. If you are planing to work like this, please add a comment for the scenario of this variable.


---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

Posted by kevinjmh <gi...@git.apache.org>.
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r230972236
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
    
    You are right. 
    Normal query flow goes to `CarbonInputFormat#getPrunedBlocklets` and initialize the pruning info for table we queried.  But count star query without filter use a different query plan, it does not go into that method, so no pruning info does not init. When it calls default data map to prune(with a null filter), exception will occur during settingg pruning info.
    
    One solution is to init the pruning info for this type of query in mrthod `getBlockRowCount`. But considering
    no useful information about block/blocklet pruning for such query(actually no pruning), I choose to disable the expalin collector instead.


---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

Posted by kevinjmh <gi...@git.apache.org>.
Github user kevinjmh commented on the issue:

    https://github.com/apache/carbondata/pull/2900
  
    retest this please


---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
  
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1507/



---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1294/



---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
  
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9568/



---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1304/



---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2900


---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

Posted by kevinjmh <gi...@git.apache.org>.
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r231010531
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
    
    No remove. Its implementation is disable.


---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
  
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1495/



---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r231012917
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
    
    please add comments for your modification in the code for better understanding


---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

Posted by kevinjmh <gi...@git.apache.org>.
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r231014871
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
    
    OK


---

[GitHub] carbondata issue #2900: [CARBONDATA-3078] Disable explain collector for coun...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2900
  
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9545/



---

[GitHub] carbondata pull request #2900: [CARBONDATA-3078] Disable explain collector f...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2900#discussion_r231012866
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -575,6 +576,8 @@ private BitSet setMatchedPartitions(String partitionIds, Expression filter,
        */
       public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
           List<PartitionSpec> partitions) throws IOException {
    +    // no useful information for count star query without filter, so disable explain collector
    +    ExplainCollector.remove();
    --- End diff --
    
    oh... I understand.
    The current implementation of pruning collector may has bugs. Based on the current implementation, your modification is OK...


---