Posted to issues@carbondata.apache.org by manishgupta88 <gi...@git.apache.org> on 2018/07/19 14:02:57 UTC

[GitHub] carbondata pull request #2531: [HOTFIX] Improved BlockDataMap caching perfor...

GitHub user manishgupta88 opened a pull request:

    https://github.com/apache/carbondata/pull/2531

    [HOTFIX] Improved BlockDataMap caching performance during first time query

    Things done as part of this PR
    1. Created the taskSummary and FileFooterEntry schemas once and stored them in member variables. Creating the schema every time was a costly operation and increased the time taken to prune dataMaps.
    2. Used TreeMap instead of HashMap while adding the complete file path and data to the map during merge file read. Using TreeMap improved the map filling performance by 10 sec for 1200 entries.
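
A minimal sketch of the two changes described above, not the actual CarbonData code. The class name, the `String[]` stand-in for a schema, and the `readMergeFile` helper are all hypothetical and exist only for illustration. It shows a schema that is built on first use and then reused from a member variable, and a TreeMap collecting file path to content entries.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; names and types are assumptions, not CarbonData APIs.
class SchemaCacheSketch {
  // Counts how often the expensive build runs, for demonstration purposes.
  int buildCount = 0;

  // Built once on first use, then reused by every later caller.
  private String[] taskSummarySchema;

  // Stand-in for an expensive schema construction.
  private String[] buildTaskSummarySchema() {
    buildCount++;
    return new String[] {"minValues", "maxValues", "rowCount"};
  }

  // Returns the cached schema, creating it only if it does not exist yet.
  synchronized String[] getTaskSummarySchema() {
    if (taskSummarySchema == null) {
      taskSummarySchema = buildTaskSummarySchema();
    }
    return taskSummarySchema;
  }

  // Collects file path -> content pairs in a TreeMap, keyed by the complete path.
  static Map<String, byte[]> readMergeFile(String[] paths, byte[][] contents) {
    Map<String, byte[]> result = new TreeMap<>();
    for (int i = 0; i < paths.length; i++) {
      result.put(paths[i], contents[i]);
    }
    return result;
  }
}
```

Calling `getTaskSummarySchema()` repeatedly returns the same array instance, so the build cost is paid once per holder object. The TreeMap keeps its keys sorted by path; the speedup reported above is the author's measurement and is not something this sketch demonstrates.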
    
     - [ ] Any interfaces changed?
     No
     - [ ] Any backward compatibility impacted?
     No
     - [ ] Document update required?
     No
     - [ ] Testing done
     Verified manually
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
     NA


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/carbondata query_perf

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2531.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2531
    
----
commit 26954b88d606535349f83f80a3e00f9b2db4fd66
Author: manishgupta88 <to...@...>
Date:   2018-07-19T13:45:12Z

    Code modification done to improve query performance

----


---

[GitHub] carbondata issue #2531: [HOTFIX] Improved BlockDataMap caching performance d...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7357/



---

[GitHub] carbondata issue #2531: [WIP] [HOTFIX] Improved BlockDataMap caching perform...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7325/



---

[GitHub] carbondata issue #2531: [HOTFIX] Improved BlockDataMap caching performance d...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5943/



---

[GitHub] carbondata issue #2531: [HOTFIX] Improved BlockDataMap caching performance d...

Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    retest sdv please


---

[GitHub] carbondata pull request #2531: [HOTFIX] Improved BlockDataMap caching perfor...

Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2531#discussion_r203999922
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java ---
    @@ -237,6 +241,32 @@ public void invalidate(String segmentId, int segmentPropertiesIndex,
               .isEmpty()) {
             indexToSegmentPropertiesWrapperMapping.remove(segmentPropertiesIndex);
             segmentPropWrapperToSegmentSetMap.remove(segmentPropertiesWrapper);
    +      } else if (!clearSegmentWrapperFromMap
    +          && segmentIdAndSegmentPropertiesIndexWrapper.segmentIdSet.isEmpty()) {
    +        // min max columns can vary when cache is modified. So even though entry is not required
    +        // to be deleted from map, clear the column cache so that it can be filled again
    +        segmentPropertiesWrapper.clear();
    +        LOGGER.info("cleared min max for segmentProperties at index: " + segmentPropertiesIndex);
    +      }
    +    }
    +  }
    +
    +  /**
    +   * add segmentId at given segmentPropertyIndex
    +   * Note: This method is used in extension with other features. Please do not remove it
    +   *
    +   * @param segmentPropertiesIndex
    +   * @param segmentId
    +   */
    +  public void addSegmentId(int segmentPropertiesIndex, String segmentId) {
    +    SegmentPropertiesWrapper segmentPropertiesWrapper =
    +        indexToSegmentPropertiesWrapperMapping.get(segmentPropertiesIndex);
    +    if (null != segmentPropertiesWrapper) {
    +      SegmentIdAndSegmentPropertiesIndexWrapper segmentIdAndSegmentPropertiesIndexWrapper =
    +          segmentPropWrapperToSegmentSetMap.get(segmentPropertiesWrapper);
    +      synchronized (segmentPropertiesWrapper.getTableIdentifier().getCarbonTableIdentifier()
    --- End diff --
    
    Use  getOrCreateTableLock 
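
One way to read this suggestion, sketched under assumptions: rather than synchronizing on an object obtained through a chain of getters, look up a dedicated lock object per table and synchronize on that. The class below is hypothetical. The method name `getOrCreateTableLock` is taken from the comment, but its signature, the `String` table key, and the segment bookkeeping are invented here and may not match the real method in SegmentPropertiesAndSchemaHolder.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not the CarbonData implementation.
class TableLockSketch {
  // One lock object per table, created on demand.
  private final Map<String, Object> tableLocks = new ConcurrentHashMap<>();
  // Segment ids tracked per table; mutated only while holding that table's lock.
  private final Map<String, Set<String>> segmentIdsByTable = new ConcurrentHashMap<>();

  // Returns the lock for the table, creating it atomically if it is missing.
  Object getOrCreateTableLock(String tableId) {
    return tableLocks.computeIfAbsent(tableId, id -> new Object());
  }

  // Adds a segment id under the table's own lock.
  void addSegmentId(String tableId, String segmentId) {
    synchronized (getOrCreateTableLock(tableId)) {
      segmentIdsByTable.computeIfAbsent(tableId, id -> new HashSet<String>()).add(segmentId);
    }
  }

  // Returns a snapshot copy of the segment ids for the table.
  Set<String> segmentIds(String tableId) {
    synchronized (getOrCreateTableLock(tableId)) {
      Set<String> ids = segmentIdsByTable.get(tableId);
      return ids == null ? new HashSet<String>() : new HashSet<String>(ids);
    }
  }
}
```

With this shape, every caller that works on the same table contends on the same lock object regardless of which identifier instance it happens to hold, while different tables do not block each other.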


---

[GitHub] carbondata issue #2531: [HOTFIX] Improved BlockDataMap caching performance d...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7354/



---

[GitHub] carbondata issue #2531: [HOTFIX] Improved BlockDataMap caching performance d...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6086/



---

[GitHub] carbondata issue #2531: [HOTFIX] Improved BlockDataMap caching performance d...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6108/



---

[GitHub] carbondata issue #2531: [HOTFIX] Improved BlockDataMap caching performance d...

Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    retest sdv please



---

[GitHub] carbondata issue #2531: [WIP] [HOTFIX] Improved BlockDataMap caching perform...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6092/



---

[GitHub] carbondata issue #2531: [HOTFIX] Improved BlockDataMap caching performance d...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6120/



---

[GitHub] carbondata issue #2531: [HOTFIX] Improved BlockDataMap caching performance d...

Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    LGTM


---

[GitHub] carbondata issue #2531: [WIP] [HOTFIX] Improved BlockDataMap caching perform...

Posted by manishgupta88 <gi...@git.apache.org>.
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    retest this please


---

[GitHub] carbondata issue #2531: [WIP] [HOTFIX] Improved BlockDataMap caching perform...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6099/



---

[GitHub] carbondata issue #2531: [WIP] [HOTFIX] Improved BlockDataMap caching perform...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7334/



---

[GitHub] carbondata issue #2531: [HOTFIX] Improved BlockDataMap caching performance d...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5931/



---

[GitHub] carbondata pull request #2531: [HOTFIX] Improved BlockDataMap caching perfor...

Posted by manishgupta88 <gi...@git.apache.org>.
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2531#discussion_r204000404
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java ---
    @@ -237,6 +241,32 @@ public void invalidate(String segmentId, int segmentPropertiesIndex,
               .isEmpty()) {
             indexToSegmentPropertiesWrapperMapping.remove(segmentPropertiesIndex);
             segmentPropWrapperToSegmentSetMap.remove(segmentPropertiesWrapper);
    +      } else if (!clearSegmentWrapperFromMap
    +          && segmentIdAndSegmentPropertiesIndexWrapper.segmentIdSet.isEmpty()) {
    +        // min max columns can vary when cache is modified. So even though entry is not required
    +        // to be deleted from map, clear the column cache so that it can be filled again
    +        segmentPropertiesWrapper.clear();
    +        LOGGER.info("cleared min max for segmentProperties at index: " + segmentPropertiesIndex);
    +      }
    +    }
    +  }
    +
    +  /**
    +   * add segmentId at given segmentPropertyIndex
    +   * Note: This method is used in extension with other features. Please do not remove it
    +   *
    +   * @param segmentPropertiesIndex
    +   * @param segmentId
    +   */
    +  public void addSegmentId(int segmentPropertiesIndex, String segmentId) {
    +    SegmentPropertiesWrapper segmentPropertiesWrapper =
    +        indexToSegmentPropertiesWrapperMapping.get(segmentPropertiesIndex);
    +    if (null != segmentPropertiesWrapper) {
    +      SegmentIdAndSegmentPropertiesIndexWrapper segmentIdAndSegmentPropertiesIndexWrapper =
    +          segmentPropWrapperToSegmentSetMap.get(segmentPropertiesWrapper);
    +      synchronized (segmentPropertiesWrapper.getTableIdentifier().getCarbonTableIdentifier()
    --- End diff --
    
    ok


---

[GitHub] carbondata issue #2531: [HOTFIX] Improved BlockDataMap caching performance d...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2531
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5946/



---

[GitHub] carbondata pull request #2531: [HOTFIX] Improved BlockDataMap caching perfor...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2531


---