Posted to issues@carbondata.apache.org by watermen <gi...@git.apache.org> on 2017/03/16 02:38:28 UTC

[GitHub] incubator-carbondata pull request #659: Reuse the same SegmentProperties obj...

GitHub user watermen opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/659

    Reuse the same SegmentProperties objects to reduce the memory

    When I loaded CarbonData 1000+ times on 35 nodes, I found that SegmentProperties objects occupied 2.5+ GB (76 KB * 35 * 1000) of memory in the driver.
    ![carbonproperties](https://cloud.githubusercontent.com/assets/1400819/23979443/82d44320-0a34-11e7-9a5b-c4dcab4f9232.jpg)
    I don't have small files, so I don't want to compact the segments. I analyzed the dump file and found that the values of the SegmentProperties objects are identical, so I think we can reuse a single SegmentProperties object where possible.
    
    cc @jackylk @QiangCai 
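
The estimate in the description above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: `DriverMemoryEstimate` is a hypothetical helper, "76K" is assumed to mean 76 * 1024 bytes, and each load is assumed to produce one segment. Under those assumptions, one copy per node per load comes to roughly 2.5 GiB, while one copy per load would be roughly 74 MiB.

```java
/** Back-of-the-envelope estimate; assumes "76K" means 76 * 1024 bytes per object. */
final class DriverMemoryEstimate {
  /** Total bytes held when every load keeps copiesPerLoad copies of the object. */
  static long totalBytes(long bytesPerObject, long copiesPerLoad, long loads) {
    return bytesPerObject * copiesPerLoad * loads;
  }

  public static void main(String[] args) {
    long perObject = 76L * 1024;
    long perNode = totalBytes(perObject, 35, 1000);    // one copy per node per load
    long perSegment = totalBytes(perObject, 1, 1000);  // one copy per load (segment)
    System.out.printf("per node: %.2f GiB, per segment: %.1f MiB%n",
        perNode / (1024.0 * 1024 * 1024), perSegment / (1024.0 * 1024));
  }
}
```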

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/watermen/incubator-carbondata CARBONDATA-781

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/659.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #659
    
----
commit b82e26907bd6882399cd4084e7584379b10c934c
Author: Yadong Qi <qi...@gmail.com>
Date:   2017-03-15T09:41:42Z

    Reuse the same SegmentProperties to reduce the memory.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata pull request #659: [CARBONDATA-781] Reuse SegmentProper...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/659#discussion_r112799728
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/SegmentTaskIndexStore.java ---
    @@ -396,4 +419,34 @@ public TaskBucketHolder(String taskNo, String bucketNumber) {
           return result;
         }
       }
    +
    +  public static class SegmentPropertiesWrapper {
    --- End diff --
    
    @jackylk Added.



[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    @QiangCai When we query a big table, memory pressure on the driver side is greater than on the executor side. So I think this PR can first reuse the segment properties on the driver side, and the executor-side reuse can be done in another PR.



[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse the same SegmentProp...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    @jackylk If I reuse `SegmentProperties` and only compare the parameters `columnsInTable` and `columnCardinality`, which cases would I miss or get wrong? I ask because comparing `SegmentProperties` objects is easier than comparing `CarbonDimension` and `CarbonMeasure` objects.



[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    @watermen Is the attached heap dump from the executor-side B-tree or the driver-side one?
    1. On the driver side, the B-tree is loaded per segment, and one segment has only one segment properties instance.
    2. On the executor side, Carbon loads segment properties for each block (carbondata file), so there are more segment properties instances on the executor side.
    
    In my opinion we need to optimise the executor-side B-tree loading: for one segment, there should be only one segment properties instance across blocks.




[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse SegmentProperties ob...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1759/




[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    @jackylk @QiangCai I have already changed the code to use the "Store one SegmentProperties object per segment" solution.



[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse the same SegmentProp...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1168/




[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse the same SegmentProp...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1200/




[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    @watermen @QiangCai 
    1. On the driver side, the index is loaded per task ID for each segment. Across the task IDs of the same segment we can create and load only one segment properties instance, and across segments we can reuse the same segment properties if the Carbon table schema and the cardinality of the dictionary columns are the same.
    2. On the executor side, across tasks on the same executor, we can try to reuse the same segment properties if they are identical.



[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1726/




[GitHub] incubator-carbondata pull request #659: [CARBONDATA-781] Store one SegmentPr...

Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/659#discussion_r108033699
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentTaskIndex.java ---
    @@ -16,30 +16,52 @@
      */
     package org.apache.carbondata.core.datastore.block;
     
    +import java.util.HashMap;
     import java.util.List;
    +import java.util.Map;
     
     import org.apache.carbondata.core.datastore.BTreeBuilderInfo;
     import org.apache.carbondata.core.datastore.BtreeBuilder;
     import org.apache.carbondata.core.datastore.impl.btree.BlockBTreeBuilder;
    +import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
     import org.apache.carbondata.core.metadata.blocklet.DataFileFooter;
     
     /**
      * Class which is responsible for loading the b+ tree block. This class will
      * persist all the detail of a table segment
      */
     public class SegmentTaskIndex extends AbstractIndex {
    +  private static Map<SegmentKey, SegmentProperties> segmentPropertiesCached =
    --- End diff --
    
    why not use TableSegmentUniqueIdentifier?



[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse the same SegmentProp...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1186/




[GitHub] incubator-carbondata pull request #659: [CARBONDATA-781] Reuse SegmentProper...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-carbondata/pull/659



[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse the same SegmentProp...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1176/




[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse SegmentProperties ob...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    @jackylk @kumarvishal09 Can you review the code again?



[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    @kumarvishal09 
    1. On the driver side, one segment has n SegmentProperties objects, where n is the number of nodes. See `SegmentTaskIndexStore.loadBlocks` or ask @QiangCai for details.
    2. On the executor side, we can't keep just one SegmentProperties object per segment, because tasks of the same segment run on different executors, which are different processes.



[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse SegmentProperties ob...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    LGTM



[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse the same SegmentProp...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    I checked the recently merged PR641 for restructuring. It seems that this is still used for maintaining metadata for each segment, so it can't be changed to TableProperties.
    One suggestion I have is to abstract and reuse some of the big objects inside SegmentProperties, such as the CarbonDimension and CarbonMeasure objects. They should be cached and fetched by ID, so that they can be reused across segments.
    
    
    From: Yadong Qi [mailto:notifications@github.com]
    Sent: March 17, 2017 10:00
    To: apache/incubator-carbondata
    Cc: Likun (Jacky); Mention
    Subject: Re: [apache/incubator-carbondata] [CARBONDATA-781] Reuse the same SegmentProperties objects to reduce the memory (#659)
    
    
    @jackylk<https://github.com/jackylk> You mean we can now store the properties at table level (maybe called TableProperties) instead of in SegmentProperties?
    
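
The suggestion in this message, caching the large column objects and fetching them by ID so they can be shared across segments, could look roughly like the pool below. This is a minimal sketch, not CarbonData code: `ColumnDescriptor` and `ColumnDescriptorPool` are hypothetical stand-ins for `CarbonDimension`/`CarbonMeasure` and their cache, and it assumes that descriptors are immutable and that a column ID uniquely determines a descriptor.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical immutable column descriptor (stand-in for CarbonDimension/CarbonMeasure). */
final class ColumnDescriptor {
  final String columnId;
  final String name;

  ColumnDescriptor(String columnId, String name) {
    this.columnId = columnId;
    this.name = name;
  }
}

/** Hypothetical pool that hands out one shared descriptor per column ID. */
final class ColumnDescriptorPool {
  private static final ConcurrentMap<String, ColumnDescriptor> POOL = new ConcurrentHashMap<>();

  /** Returns the pooled descriptor for columnId, creating it on the first request. */
  static ColumnDescriptor get(String columnId, String name) {
    return POOL.computeIfAbsent(columnId, id -> new ColumnDescriptor(id, name));
  }
}
```

Every segment that asks for the same column ID receives the same instance, so the per-segment object only needs to hold references rather than its own copies.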




[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse SegmentProperties ob...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1734/




[GitHub] incubator-carbondata pull request #659: [CARBONDATA-781] Store one SegmentPr...

Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/659#discussion_r108146058
  
    --- Diff: core/src/test/java/org/apache/carbondata/core/datastore/block/SegmentTaskIndexTest.java ---
    @@ -58,7 +58,9 @@
           @Mock public void build(BTreeBuilderInfo segmentBuilderInfos) {}
         };
         long numberOfRows = 100;
    -    SegmentTaskIndex segmentTaskIndex = new SegmentTaskIndex();
    +    SegmentProperties properties = new SegmentProperties(footerList.get(0).getColumnInTable(),
    --- End diff --
    
    This should come after the variable footerList is initialized.
    Move it to line 72.



[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse SegmentProperties ob...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1738/




[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1356/




[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    @kumarvishal09 The dump picture shows the driver-side tree.
    @watermen This PR only implements reuse of segment properties on the driver side. Can you try to do it on the executor side as well? For how the executor-side tree is built, please have a look at AbstractQueryExecutor.initQuery and BlockIndexStore.getAll.



[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    @watermen You can move all the schema-related properties into a wrapper class and implement hashCode and equals in it based on the dimension columns (including complex dimensions) and the measure columns. In the segment index store you can then keep a static HashMap that maps each table to its list of segment properties. This will solve the alter table problem.
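
The wrapper idea in this comment can be sketched as follows. This is an illustration only, not the code merged in the PR: `Props`, `SchemaKey`, and `PropsCache` are hypothetical names, plain strings stand in for column schemas, and a single flat map keyed by table plus schema is used instead of a per-table list. Because the key includes the columns and their cardinality, a segment written after an alter table gets its own entry rather than reusing a stale one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for the heavy per-segment metadata object. */
final class Props {
  final List<String> columns;
  final int[] cardinality;

  Props(List<String> columns, int[] cardinality) {
    this.columns = columns;
    this.cardinality = cardinality;
  }
}

/** Hypothetical key: equal when table, columns and cardinality all match. */
final class SchemaKey {
  private final String tableId;
  private final List<String> columns;
  private final int[] cardinality;

  SchemaKey(String tableId, List<String> columns, int[] cardinality) {
    this.tableId = tableId;
    this.columns = new ArrayList<>(columns);  // defensive copies keep the key stable
    this.cardinality = cardinality.clone();
  }

  @Override public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof SchemaKey)) return false;
    SchemaKey other = (SchemaKey) o;
    return tableId.equals(other.tableId)
        && columns.equals(other.columns)
        && Arrays.equals(cardinality, other.cardinality);  // compare array contents
  }

  @Override public int hashCode() {
    return 31 * Objects.hash(tableId, columns) + Arrays.hashCode(cardinality);
  }
}

/** Hypothetical static cache: one Props instance per distinct schema key. */
final class PropsCache {
  private static final Map<SchemaKey, Props> CACHE = new HashMap<>();

  static synchronized Props getOrCreate(String tableId, List<String> columns, int[] cardinality) {
    SchemaKey key = new SchemaKey(tableId, columns, cardinality);
    Props cached = CACHE.get(key);
    if (cached == null) {
      cached = new Props(new ArrayList<>(columns), cardinality.clone());
      CACHE.put(key, cached);
    }
    return cached;
  }

  static synchronized int size() {
    return CACHE.size();
  }
}
```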



[GitHub] incubator-carbondata pull request #659: [CARBONDATA-781] Reuse SegmentProper...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/659#discussion_r112787886
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/SegmentTaskIndexStore.java ---
    @@ -396,4 +419,34 @@ public TaskBucketHolder(String taskNo, String bucketNumber) {
           return result;
         }
       }
    +
    +  public static class SegmentPropertiesWrapper {
    --- End diff --
    
    Please add a comment for this class.



[GitHub] incubator-carbondata pull request #659: [CARBONDATA-781] Store one SegmentPr...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/659#discussion_r108113194
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentTaskIndex.java ---
    @@ -16,30 +16,52 @@
      */
     package org.apache.carbondata.core.datastore.block;
     
    +import java.util.HashMap;
     import java.util.List;
    +import java.util.Map;
     
     import org.apache.carbondata.core.datastore.BTreeBuilderInfo;
     import org.apache.carbondata.core.datastore.BtreeBuilder;
     import org.apache.carbondata.core.datastore.impl.btree.BlockBTreeBuilder;
    +import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
     import org.apache.carbondata.core.metadata.blocklet.DataFileFooter;
     
     /**
      * Class which is responsible for loading the b+ tree block. This class will
      * persist all the detail of a table segment
      */
     public class SegmentTaskIndex extends AbstractIndex {
    +  private static Map<SegmentKey, SegmentProperties> segmentPropertiesCached =
    --- End diff --
    
    I hadn't found that class before. I will try to add a SegmentProperties field to TableSegmentUniqueIdentifier.
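
Holding the properties on the per-segment identifier, as proposed in this reply, amounts to lazy initialization: the first task of a segment builds the object and later tasks reuse it. The sketch below shows that pattern under stated assumptions; `SegmentIdentifierSketch` and `getOrCreateSegmentProperties` are hypothetical names, and the real `TableSegmentUniqueIdentifier` may look quite different.

```java
import java.util.function.Supplier;

/** Hypothetical per-segment identifier that lazily holds one shared metadata object. */
final class SegmentIdentifierSketch<T> {
  private final String segmentId;
  private T segmentProperties;  // stays null until the first task of this segment loads

  SegmentIdentifierSketch(String segmentId) {
    this.segmentId = segmentId;
  }

  String getSegmentId() {
    return segmentId;
  }

  /** Builds the metadata on first use; later tasks of the same segment reuse it. */
  synchronized T getOrCreateSegmentProperties(Supplier<T> factory) {
    if (segmentProperties == null) {
      segmentProperties = factory.get();
    }
    return segmentProperties;
  }
}
```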


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata pull request #659: [CARBONDATA-781] Store one SegmentPr...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/659#discussion_r110810313
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/SegmentTaskIndexStore.java ---
    @@ -338,14 +339,28 @@ private synchronized Object addAndGetSegmentLock(String segmentId) {
        * @throws IOException
        */
       private AbstractIndex loadBlocks(TaskBucketHolder taskBucketHolder,
    -      List<TableBlockInfo> tableBlockInfoList, AbsoluteTableIdentifier tableIdentifier)
    +      List<TableBlockInfo> tableBlockInfoList, AbsoluteTableIdentifier tableIdentifier,
    +      TableSegmentUniqueIdentifier tableSegmentUniqueIdentifier)
           throws IOException {
         // all the block of one task id will be loaded together
         // so creating a list which will have all the data file meta data to of one task
         List<DataFileFooter> footerList = CarbonUtil
             .readCarbonIndexFile(taskBucketHolder.taskNo, taskBucketHolder.bucketNumber,
                 tableBlockInfoList, tableIdentifier);
    -    AbstractIndex segment = new SegmentTaskIndex();
    +
    +    if (null == tableSegmentUniqueIdentifier.getSegmentProperties()) {
    +      // create a metadata details
    +      // this will be useful in query handling
    +      // all the data file metadata will have common segment properties we
    +      // can use first one to get create the segment properties
    +      SegmentProperties segmentProperties =
    +          new SegmentProperties(footerList.get(0).getColumnInTable(),
    --- End diff --
    
    I think the hashmap approach is OK. You can keep a hashmap in this class and store the mapping from a key (calculated from schema) to the SegmentProperties object.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1351/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1334/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse the same SegmentProp...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1201/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    @kumarvishal09 I agree with your idea. I think we should keep a static HashMap whose key is <AbsoluteTableIdentifier, List<ColumnSchema>, int[] columnCardinality> (this was my first implementation). But @QiangCai thinks we may alter the SegmentProperties object later (for example, column visibility), so if we reuse the SegmentProperties object across segments, something will go wrong.
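
The concern raised here is ordinary aliasing: once two segments point at the same mutable object, a change meant for one is visible through the other. The sketch below demonstrates this with hypothetical names (`SharedProps`, `SharedMutationDemo`, and a made-up `columnVisible` flag); it does not reflect actual fields of `SegmentProperties`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical mutable metadata object, standing in for a shared SegmentProperties. */
final class SharedProps {
  private boolean columnVisible = true;  // made-up field that might be altered later

  boolean isColumnVisible() {
    return columnVisible;
  }

  void setColumnVisible(boolean visible) {
    this.columnVisible = visible;
  }
}

final class SharedMutationDemo {
  /** Returns the visibility seen by segment "1" after segment "0" hides the column. */
  static boolean visibilitySeenBySegmentOne(boolean shareInstance) {
    Map<String, SharedProps> bySegment = new HashMap<>();
    SharedProps first = new SharedProps();
    bySegment.put("0", first);
    // Either reuse the same object for segment "1" or give it a private one.
    bySegment.put("1", shareInstance ? first : new SharedProps());

    bySegment.get("0").setColumnVisible(false);  // change intended for segment "0" only
    return bySegment.get("1").isColumnVisible();
  }
}
```

With a shared instance the call returns false, because the change leaks into segment "1"; with separate instances it returns true. Reuse across segments is therefore only safe while the shared object is treated as immutable.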


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse the same SegmentProp...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Earlier, this was used for handling restructure information. Now it is handled in another way, so yes we should change it to reduce the number of objects 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse SegmentProperties ob...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1729/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse the same SegmentProp...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    @jackylk You mean we can now store the properties at table level (maybe called TableProperties) instead of in SegmentProperties?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1335/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/659
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1359/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---