Posted to issues@carbondata.apache.org by ravipesala <gi...@git.apache.org> on 2018/03/08 04:53:04 UTC

[GitHub] carbondata pull request #2043: [WIP] PR-2030 sdv fix

GitHub user ravipesala opened a pull request:

    https://github.com/apache/carbondata/pull/2043

    [WIP] PR-2030 sdv fix

    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
    
     - [ ] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata pr-2030-new

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2043.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2043
    
----
commit 9086a1b9f2cd6cf1d4d42290a4e3678b01472714
Author: SangeetaGulia <sa...@...>
Date:   2017-09-21T09:26:26Z

    [CARBONDATA-1827] S3 Carbon Implementation
    
    1. Provide support for s3 in carbondata.
    2. Added S3Example to create carbon table on s3.
    3. Added S3CSVExample to load carbon table using csv from s3.
    
    This closes #1805

commit 0c75ab7359ad89a16f749e84bd42416523d5255a
Author: Jacky Li <ja...@...>
Date:   2018-01-02T15:46:14Z

    [CARBONDATA-1968] Add external table support
    
    This PR adds support for creating external table with existing carbondata files, using Hive syntax.
    CREATE EXTERNAL TABLE tableName STORED BY 'carbondata' LOCATION 'path'
    
    This closes #1749

commit 5663e916fe906675ce8efa320de1ed550315dc00
Author: Jacky Li <ja...@...>
Date:   2018-01-06T12:28:44Z

    [CARBONDATA-1992] Remove partitionId in CarbonTablePath
    
    CarbonTablePath contains a deprecated partition id that is always 0; it should be removed to avoid confusion.
    
    This closes #1765

commit bd40a0d73d2a7086caaa6773a2c6a1a45e24334c
Author: Jacky Li <ja...@...>
Date:   2018-01-31T16:25:31Z

    [REBASE] Solve conflict after rebasing master

commit 92c9f224094581378a681fd1f7b0cb02b923687c
Author: Jacky Li <ja...@...>
Date:   2018-01-30T13:24:04Z

    [CARBONDATA-2099] Refactor query scan process to improve readability
    
    Unified concepts in scan process flow:
    
    1. QueryModel contains all parameters for a scan; it is created by an API in CarbonTable. (In future, CarbonTable will be the entry point for various table operations.)
    2. Use the term ColumnChunk to represent one column in one blocklet, and use ChunkIndex in the reader to read a specified column chunk.
    3. Use the term ColumnPage to represent one page in one ColumnChunk.
    4. QueryColumn => ProjectionColumn, indicating it is for projection.
    
    This closes #1874

commit f06824e9744a831776b1203c94d4001eef870b14
Author: Jacky Li <ja...@...>
Date:   2018-01-31T08:14:27Z

    [CARBONDATA-2025] Unify all path construction through CarbonTablePath static method
    
    Refactor CarbonTablePath:
    
    1. Remove CarbonStorePath and use CarbonTablePath only.
    2. Make CarbonTablePath a utility that needs no object creation; callers no longer create an object before using it, so the code is cleaner and there is less GC.
    
    This closes #1768

commit f9d15a215adc91077f1a6ca6a456e5fce4bc05eb
Author: sounakr <so...@...>
Date:   2017-09-28T10:51:05Z

    [CARBONDATA-1480]Min Max Index Example for DataMap
    
    Datamap example: an implementation of a Min Max Index through Datamap, and use of the index while pruning.
    
    This closes #1359
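
A min-max index of the kind named in the commit above can be sketched as follows. This is only an illustration of the general pruning technique, not the code from that commit; `BlockletStats` and `prune_equals` are hypothetical names, and integer column values are assumed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockletStats:
    """Min and max of one column within one blocklet (hypothetical structure)."""
    blocklet_id: int
    min_value: int
    max_value: int


def prune_equals(stats: List[BlockletStats], value: int) -> List[int]:
    """Return ids of blocklets whose [min, max] range may contain `value`.

    Blocklets whose range excludes the value cannot match an equality
    filter, so the scan can skip them entirely.
    """
    return [s.blocklet_id for s in stats if s.min_value <= value <= s.max_value]


stats = [
    BlockletStats(0, 1, 10),
    BlockletStats(1, 11, 20),
    BlockletStats(2, 15, 30),
]
# Only blocklets 1 and 2 can contain the value 17.
candidates = prune_equals(stats, 17)
```

The index can only rule blocklets out; the surviving blocklets still have to be scanned with the real filter.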

commit bb5bb00af982831ea83c73a2c437aa4aea8a5422
Author: ravipesala <ra...@...>
Date:   2017-11-15T14:18:40Z

    [CARBONDATA-1544][Datamap] Datamap FineGrain implementation
    
    Implemented interfaces for the FG datamap and integrated them into the filterscanner to use the pruned bitset from the FG datamap.
    The FG query flow is as follows.
    1. The user can add an FG datamap to any table and implement its interfaces.
    2. Any filter query that hits a table with a datamap calls the prune method of the FG datamap.
    3. The prune method of the FG datamap returns a list of FineGrainBlocklet; these blocklets contain block, blocklet, page and row id information.
    4. The pruned blocklets are written to a file internally, and only the block, blocklet and file path information is returned as part of the splits.
    5. Based on the splits, the scanrdd schedules the tasks.
    6. In the filterscanner, we check the datamapwriterpath from the split and read the bitset if it exists, then pass this bitset as input to it.
    
    This closes #1471

commit dfbdf3db00cbb488e49d3125b4ec93ff9e0dc9b2
Author: Jatin <ja...@...>
Date:   2018-01-25T11:23:00Z

    [CARBONDATA-2080] [S3-Implementation] Propagated hadoopConf from driver to executor for s3 implementation in cluster mode.
    
    Problem: hadoopConf was not propagated from the driver to the executors, so loads failed in a distributed environment.
    Solution: Set the Hadoop conf in the base class CarbonRDD.
    How to verify this PR:
    Execute a load in cluster mode using an s3 location; it should succeed.
    
    This closes #1860

commit 586ab70228a09aca0240434327a8d4a4ff14446d
Author: xuchuanyin <xu...@...>
Date:   2018-02-08T06:35:14Z

    [CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row
    
    Pick up the no-sort fields in the row, pack them as a byte array, and skip parsing them during merge sort to reduce CPU consumption
    
    This closes #1792
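
The idea in the commit above (keep the no-sort fields as opaque bytes while merging) can be sketched roughly as below. This is a minimal illustration, not the CarbonData implementation; the row layout (one sort key plus two measures), `NO_SORT_FORMAT`, and all function names are assumptions made for the example.

```python
import heapq
import struct
from typing import List, Tuple

# Hypothetical layout of the no-sort fields: one 64-bit int and one double.
NO_SORT_FORMAT = "<qd"

TempRow = Tuple[int, bytes]


def to_temp_row(row: Tuple[int, int, float]) -> TempRow:
    """Keep the sort key as-is and pack the no-sort fields into bytes."""
    key, measure_a, measure_b = row
    return key, struct.pack(NO_SORT_FORMAT, measure_a, measure_b)


def merge_runs(runs: List[List[TempRow]]) -> List[TempRow]:
    """Merge sorted runs by sort key only; the packed bytes are never parsed."""
    return list(heapq.merge(*runs, key=lambda temp_row: temp_row[0]))


def from_temp_row(temp_row: TempRow) -> Tuple[int, int, float]:
    """Unpack the no-sort fields once, after the merge has finished."""
    key, payload = temp_row
    measure_a, measure_b = struct.unpack(NO_SORT_FORMAT, payload)
    return key, measure_a, measure_b


run1 = [to_temp_row(r) for r in [(1, 10, 0.5), (4, 40, 2.0)]]
run2 = [to_temp_row(r) for r in [(2, 20, 1.0), (3, 30, 1.5)]]
merged = [from_temp_row(r) for r in merge_runs([run1, run2])]
```

The merge compares sort keys only, so the cost of decoding the remaining fields is paid once per row instead of once per comparison pass.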

commit 3fdd5d0f567e8d07cc502202ced7d490fa85e2ad
Author: xuchuanyin <xu...@...>
Date:   2018-02-08T06:42:39Z

    [CARBONDATA-2023][DataLoad] Add size base block allocation in data loading
    
    Carbondata assigns blocks to nodes at the beginning of data loading.
    The previous block allocation strategy is based on block count and
    suffers from data skew if the sizes of the input files differ a lot.
    
    We introduced a size-based block allocation strategy to optimize data
    loading performance in skewed data scenarios.
    
    This closes #1808
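
One common way to balance by size rather than by block count is a greedy assignment of the largest blocks first to the least-loaded node. The sketch below shows that general technique under stated assumptions; it is not the strategy from the commit above, which may also account for factors such as data locality. `assign_by_size` and the block and node names are hypothetical.

```python
import heapq
from typing import Dict, List


def assign_by_size(block_sizes: Dict[str, int], nodes: List[str]) -> Dict[str, List[str]]:
    """Greedily assign each block to the node with the smallest total size so far."""
    # Min-heap of (total assigned size, node name).
    heap = [(0, node) for node in nodes]
    heapq.heapify(heap)
    assignment: Dict[str, List[str]] = {node: [] for node in nodes}
    # Largest blocks first; ties are broken by block name for determinism.
    for block, size in sorted(block_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, node = heapq.heappop(heap)
        assignment[node].append(block)
        heapq.heappush(heap, (load + size, node))
    return assignment


# One large file and three small ones: a count-based split would give each
# node two blocks, leaving one node with far more data than the other.
blocks = {"b1": 900, "b2": 100, "b3": 100, "b4": 100}
result = assign_by_size(blocks, ["node1", "node2"])
```

With these sizes the large block gets a node to itself and the three small blocks share the other node.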

commit d88d5bb940f0fea6e5c8560fc5c8ea3724b95a28
Author: xuchuanyin <xu...@...>
Date:   2018-02-08T07:39:45Z

    [HotFix][CheckStyle] Fix import related checkstyle
    
    This closes #1952

commit 0bb4aed60a7b60eed49e0b5e618af269c0c03a73
Author: Jacky Li <ja...@...>
Date:   2018-02-08T17:39:20Z

    [REBASE] Solve conflict after rebasing master

commit 6216294c1e28c1db05e572f0aac3a991d345e085
Author: Jacky Li <ja...@...>
Date:   2018-02-27T00:51:25Z

    [REBASE] resolve conflict after rebasing to master

commit 28b5720fcf1cbd0d4bdf3f04e7b0edd8f9492a8d
Author: xuchuanyin <xu...@...>
Date:   2018-02-08T06:35:14Z

    [CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row
    
    Pick up the no-sort fields in the row, pack them as a byte array, and skip parsing them during merge sort to reduce CPU consumption
    
    This closes #1792

commit 8fe8ab4c078de0ccd218f8ba41352896aebd5202
Author: xuchuanyin <xu...@...>
Date:   2018-02-08T06:42:39Z

    [CARBONDATA-2023][DataLoad] Add size base block allocation in data loading
    
    Carbondata assigns blocks to nodes at the beginning of data loading.
    The previous block allocation strategy is based on block count and
    suffers from data skew if the sizes of the input files differ a lot.
    
    We introduced a size-based block allocation strategy to optimize data
    loading performance in skewed data scenarios.
    
    This closes #1808

commit 1d85e916f6a0f070960555fb18ee4cd8acbfa315
Author: Jacky Li <ja...@...>
Date:   2018-02-10T02:34:59Z

    Revert "[CARBONDATA-2023][DataLoad] Add size base block allocation in data loading"
    
    This reverts commit 6dd8b038fc898dbf48ad30adfc870c19eb38e3d0.

commit 5fccdabfc1cc4656d75e51867dcfcb250c505c91
Author: Jacky Li <ja...@...>
Date:   2018-02-10T11:44:23Z

    [CARBONDATA-1997] Add CarbonWriter SDK API
    
    Added a new module called store-sdk with a CarbonWriter API that can be used to write Carbondata files to a specified folder, without Spark and Hadoop dependencies. Users can use this API in any environment.
    
    This closes #1967

commit 46031a320506ceed10b2134710be6c630c6ee533
Author: Jacky Li <ja...@...>
Date:   2018-02-10T12:11:25Z

    Revert "[CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row"
    
    This reverts commit de92ea9a123b17d903f2d1d4662299315c792954.

commit fc31be790934f8a6f59910ab0db453b139cfc7e3
Author: Jacky Li <ja...@...>
Date:   2018-02-11T02:12:10Z

    [CARBONDATA-2156] Add interface annotation
    
    InterfaceAudience and InterfaceStability annotations should be added for users and developers
    
    1. InterfaceAudience can be User or Developer
    2. InterfaceStability can be Stable, Evolving, or Unstable
    
    This closes #1968

commit dcfe73b8b07267369f8c58130f27b75efccb4ee1
Author: Jacky Li <ja...@...>
Date:   2018-02-11T13:37:04Z

    [CARBONDATA-2159] Remove carbon-spark dependency in store-sdk module
    
    To allow assembling a JAR of the store-sdk module, it should not depend on the carbon-spark module
    
    This closes #1970

commit 503e0d96864173ccfb29e49686f0af3f7edd779f
Author: Jacky Li <ja...@...>
Date:   2018-02-13T01:12:09Z

    Support generating assembling JAR for store-sdk module
    
    Support generating assembling JAR for store-sdk module and remove junit dependency
    
    This closes #1976

commit faad967d8d83eabd3e758b081370235e42a3ecee
Author: xuchuanyin <xu...@...>
Date:   2018-02-13T02:58:06Z

    [CARBONDATA-2091][DataLoad] Support specifying sort column bounds in data loading
    
    Enhance data loading performance by specifying sort column bounds
    1. Add row range number during convert-process-step
    2. Dispatch rows to each sorter by range number
    3. Sort/Write process step can be done concurrently in each range
    4. Since all sort temp files will be written in one folder, we add the range
    number to the file name to distinguish them
    
    Tests added and docs updated
    
    This closes #1953
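
The range dispatch described in steps 1 to 3 of the commit above can be sketched as follows. This is a simplified illustration with integer sort values; the function names are hypothetical, and whether a value equal to a bound belongs to the lower or upper range is an assumption of this sketch.

```python
from bisect import bisect_right
from typing import List


def range_number(value: int, bounds: List[int]) -> int:
    """Map a sort column value to a range number using ascending bounds.

    Assumption: a value equal to a bound goes to the upper range.
    """
    return bisect_right(bounds, value)


def sort_by_ranges(values: List[int], bounds: List[int]) -> List[List[int]]:
    """Dispatch values to one sorter per range, then sort each range on its own."""
    ranges: List[List[int]] = [[] for _ in range(len(bounds) + 1)]
    for value in values:
        ranges[range_number(value, bounds)].append(value)
    # Each range is independent, so these sorts could run concurrently.
    return [sorted(r) for r in ranges]


bounds = [10, 20]
sorted_ranges = sort_by_ranges([25, 3, 10, 17, 8, 20], bounds)
```

Because every value in one range is smaller than every value in the next, concatenating the sorted ranges yields a globally sorted sequence without a final merge across ranges.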

commit 623a1f93bf50bbbf665d98d71fe2190a77774742
Author: Jacky Li <ja...@...>
Date:   2018-02-20T03:16:53Z

    [CARBONDATA-2186] Add InterfaceAudience.Internal to annotate internal interface
    
    This closes #1986

commit ce88eb6a2d6d54acf15a2bdf2a9165ecc9570647
Author: xuchuanyin <xu...@...>
Date:   2018-02-24T13:18:17Z

    [CARBONDATA-1114][Tests] Fix bugs in tests in windows env
    
    Fix bugs in tests that will cause failure under windows env
    
    This closes #1994

commit 8104735fd66952a531153eb0d3b4db5c9ecc133d
Author: Jacky Li <ja...@...>
Date:   2018-02-27T03:26:30Z

    [REBASE] Solve conflict after merging master

commit c7a9f15e2daa0207862aa28c44c51cc7cc081bac
Author: Ravindra Pesala <ra...@...>
Date:   2017-11-21T10:19:11Z

    [CARBONDATA-1543] Supported DataMap chooser and expression for supporting multiple datamaps in single query
    
    This PR supports 3 features.
    
    1. Load datamaps from the DataMapSchema which are created through DDL.
    2. DataMap Chooser: It chooses a datamap from the available datamaps based on simple logic. For example, if there is a filter condition on column1 and 2 datamaps (1. column1 2. column1+column2) support this column, then we choose the datamap which has fewer columns, that is, the first datamap.
    3. Expression support: Based on the filter expressions, we convert them to the possible DataMap expressions and apply the expressions to the datamaps.
    For example, there are 2 datamaps available on table1
    Datamap1 : column1
    Datamap2 : column2
    Query: select * from table1 where column1 = 'a' and column2 = 'b'
    For the above query, we create the datamap expression AndDataMapExpression(Datamap1, DataMap2). Both datamaps are included, and an AND condition is applied to their outputs to improve performance
    
    This closes #1510
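
The chooser rule in item 2 of the commit above (prefer the datamap with fewer columns) can be sketched as below. This is an illustration of the stated rule only, not the DataMap chooser code; `choose_datamap` and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Optional


def choose_datamap(datamaps: Dict[str, List[str]], filter_column: str) -> Optional[str]:
    """Pick the datamap that covers the filter column with the fewest columns."""
    candidates = [
        (len(columns), name)
        for name, columns in datamaps.items()
        if filter_column in columns
    ]
    if not candidates:
        return None  # no datamap supports this column
    # Fewest columns wins; ties fall back to the datamap name.
    return min(candidates)[1]


datamaps = {
    "datamap1": ["column1"],
    "datamap2": ["column1", "column2"],
}
chosen = choose_datamap(datamaps, "column1")
```

For a filter on column1 both datamaps qualify, and the single-column one is selected, matching the example in the commit message.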

commit f8ded96e659cb1e99cc6ac511d1db7cbc25dddb7
Author: Jacky Li <ja...@...>
Date:   2018-02-22T12:59:59Z

    [CARBONDATA-2189] Add DataMapProvider developer interface
    
    Add developer interface for 2 types of DataMap:
    
    1. IndexDataMap: DataMap that leverages an index to accelerate filter queries
    2. MVDataMap: DataMap that leverages a Materialized View to accelerate OLAP-style queries, like SPJG queries (select, predicate, join, groupby)
    This PR adds support for following logic when creating and dropping the DataMap
    
    This closes #1987

commit 1134431dcfa22fdfc6ff0e9d26688009cf940326
Author: Jacky Li <ja...@...>
Date:   2018-02-26T02:04:51Z

    [HOTFIX] Add java doc for datamap interface
    
    1. Rename some of the datamap interfaces
    2. Add more java doc for all public classes of the datamap interface
    
    This closes #1998

commit 96ee82b336b178c7a0d5a6d8aa8fd14afc7bc27e
Author: Jacky Li <ja...@...>
Date:   2018-02-27T05:06:02Z

    [HOTFIX] Fix timestamp issue in TestSortColumnsWithUnsafe
    
    Fix timestamp issue in TestSortColumnsWithUnsafe
    
    This closes #2001

----


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by CarbonDataQA <gi...@git.apache.org>.

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4130/



---

[GitHub] carbondata pull request #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala closed the pull request at:

    https://github.com/apache/carbondata/pull/2043


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3795/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by CarbonDataQA <gi...@git.apache.org>.

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2891/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3799/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3796/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3800/



---

[GitHub] carbondata issue #2043: PR-2030 sdv fix (Merging datamap branch into master)

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3801/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    retest sdv please


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3788/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3790/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    retest this please


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    retest sdv please


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3798/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by CarbonDataQA <gi...@git.apache.org>.

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2885/



---

[GitHub] carbondata issue #2043: PR-2030 sdv fix (Merging datamap branch into master)

Posted by CarbonDataQA <gi...@git.apache.org>.

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4139/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3794/



---

[GitHub] carbondata issue #2043: PR-2030 sdv fix (Merging datamap branch into master)

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    retest this please


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by CarbonDataQA <gi...@git.apache.org>.

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4136/



---

[GitHub] carbondata pull request #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

GitHub user ravipesala reopened a pull request:

    https://github.com/apache/carbondata/pull/2043

    [WIP] PR-2030 sdv fix

    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
    
     - [ ] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata pr-2030-new

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2043.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2043
    
----
commit 7dbac45223555b30de42aa5304397e535fd074fa
Author: SangeetaGulia <sa...@...>
Date:   2017-09-21T09:26:26Z

    [CARBONDATA-1827] S3 Carbon Implementation
    
    1. Provide support for s3 in carbondata.
    2. Added S3Example to create carbon table on s3.
    3. Added S3CSVExample to load carbon table using csv from s3.
    
    This closes #1805

commit fd5d3f9a208934ab555bf4d180ac899f7437b40f
Author: sounakr <so...@...>
Date:   2017-09-28T10:51:05Z

    [CARBONDATA-1480]Min Max Index Example for DataMap
    
    Datamap example: an implementation of a Min Max Index through Datamap, and use of the index while pruning.
    
    This closes #1359

commit ea31f6c0639752ca8fe298a01e439f6b36a9f471
Author: ravipesala <ra...@...>
Date:   2017-11-15T14:18:40Z

    [CARBONDATA-1544][Datamap] Datamap FineGrain implementation
    
    Implemented interfaces for the FG datamap and integrated them into the filterscanner to use the pruned bitset from the FG datamap.
    The FG query flow is as follows.
    1. The user can add an FG datamap to any table and implement its interfaces.
    2. Any filter query that hits a table with a datamap calls the prune method of the FG datamap.
    3. The prune method of the FG datamap returns a list of FineGrainBlocklet; these blocklets contain block, blocklet, page and row id information.
    4. The pruned blocklets are written to a file internally, and only the block, blocklet and file path information is returned as part of the splits.
    5. Based on the splits, the scanrdd schedules the tasks.
    6. In the filterscanner, we check the datamapwriterpath from the split and read the bitset if it exists, then pass this bitset as input to it.
    
    This closes #1471

commit b752b6dde963dd5dc0b12cbac1f02cf513168f67
Author: Jacky Li <ja...@...>
Date:   2018-01-02T15:46:14Z

    [CARBONDATA-1968] Add external table support
    
    This PR adds support for creating external table with existing carbondata files, using Hive syntax.
    CREATE EXTERNAL TABLE tableName STORED BY 'carbondata' LOCATION 'path'
    
    This closes #1749

commit 5bedd77b0128e69f0853ce779de40e0105f7c801
Author: Jacky Li <ja...@...>
Date:   2018-01-06T12:28:44Z

    [CARBONDATA-1992] Remove partitionId in CarbonTablePath
    
    CarbonTablePath contains a deprecated partition id that is always 0; it should be removed to avoid confusion.
    
    This closes #1765

commit d0e3df7d9cfee4976b0f2acc45ddaaf825ea78a7
Author: Jatin <ja...@...>
Date:   2018-01-25T11:23:00Z

    [CARBONDATA-2080] [S3-Implementation] Propagated hadoopConf from driver to executor for s3 implementation in cluster mode.
    
    Problem: hadoopConf was not propagated from the driver to the executors, so loads failed in a distributed environment.
    Solution: Set the Hadoop conf in the base class CarbonRDD.
    How to verify this PR:
    Execute a load in cluster mode using an s3 location; it should succeed.
    
    This closes #1860

commit bceb12175dfb5d62b9f0a1178e0daa3f40f75577
Author: Jacky Li <ja...@...>
Date:   2018-01-30T13:24:04Z

    [CARBONDATA-2099] Refactor query scan process to improve readability
    
    Unified concepts in scan process flow:
    
    1. QueryModel contains all parameters for a scan; it is created by an API in CarbonTable. (In future, CarbonTable will be the entry point for various table operations.)
    2. Use the term ColumnChunk to represent one column in one blocklet, and use ChunkIndex in the reader to read a specified column chunk.
    3. Use the term ColumnPage to represent one page in one ColumnChunk.
    4. QueryColumn => ProjectionColumn, indicating it is for projection.
    
    This closes #1874

commit 40872e618c268d16b6a62ab046b36f9b0c937b6d
Author: Jacky Li <ja...@...>
Date:   2018-01-31T08:14:27Z

    [CARBONDATA-2025] Unify all path construction through CarbonTablePath static method
    
    Refactor CarbonTablePath:
    
    1. Remove CarbonStorePath and use CarbonTablePath only.
    2. Make CarbonTablePath a utility that needs no object creation; callers no longer create an object before using it, so the code is cleaner and there is less GC.
    
    This closes #1768

commit 47872e8fa10ec37c104c8eaacf6c889d5563f23a
Author: Jacky Li <ja...@...>
Date:   2018-01-31T16:25:31Z

    [REBASE] Solve conflict after rebasing master

commit 7791ed69c8e3bd03ee6ee7b3831f7e02c5254cb4
Author: xuchuanyin <xu...@...>
Date:   2018-02-08T06:35:14Z

    [CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row
    
    Pick up the no-sort fields in the row, pack them as a byte array, and skip parsing them during merge sort to reduce CPU consumption
    
    This closes #1792

commit 51592c17aaa82a58a91e79085c153e1ac9abacb0
Author: xuchuanyin <xu...@...>
Date:   2018-02-08T06:42:39Z

    [CARBONDATA-2023][DataLoad] Add size base block allocation in data loading
    
    Carbondata assigns blocks to nodes at the beginning of data loading.
    The previous block allocation strategy is based on block count and
    suffers from data skew if the sizes of the input files differ a lot.
    
    We introduced a size-based block allocation strategy to optimize data
    loading performance in skewed data scenarios.
    
    This closes #1808

commit 0fc046508e50f87a37c932abac64a522d5acb143
Author: xuchuanyin <xu...@...>
Date:   2018-02-08T07:39:45Z

    [HotFix][CheckStyle] Fix import related checkstyle
    
    This closes #1952

commit a40ccda04629908b486c906fca93fc3be34fb1e8
Author: Jacky Li <ja...@...>
Date:   2018-02-08T17:39:20Z

    [REBASE] Solve conflict after rebasing master

commit 21fda7795d55e4aa05ffde84598f0c8f0b030bea
Author: Jacky Li <ja...@...>
Date:   2018-02-10T02:34:59Z

    Revert "[CARBONDATA-2023][DataLoad] Add size base block allocation in data loading"
    
    This reverts commit 6dd8b038fc898dbf48ad30adfc870c19eb38e3d0.

commit d5f9532a8923d56caf594d04b46f8279ddc551f5
Author: Jacky Li <ja...@...>
Date:   2018-02-10T11:44:23Z

    [CARBONDATA-1997] Add CarbonWriter SDK API
    
    Added a new module called store-sdk with a CarbonWriter API that can be used to write Carbondata files to a specified folder, without Spark and Hadoop dependencies. Users can use this API in any environment.
    
    This closes #1967

commit 3832d5e8aab151325c3227cfc9361823080ffdb9
Author: Jacky Li <ja...@...>
Date:   2018-02-10T12:11:25Z

    Revert "[CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row"
    
    This reverts commit de92ea9a123b17d903f2d1d4662299315c792954.

commit 1d7b4e09606c86cb1d66cbd1949f7e746dbb39e1
Author: Jacky Li <ja...@...>
Date:   2018-02-11T02:12:10Z

    [CARBONDATA-2156] Add interface annotation
    
    InterfaceAudience and InterfaceStability annotations should be added for users and developers
    
    1. InterfaceAudience can be User or Developer
    2. InterfaceStability can be Stable, Evolving, or Unstable
    
    This closes #1968

commit 9db10ca3ee404b10550eb8240f5f79415b01f192
Author: Jacky Li <ja...@...>
Date:   2018-02-27T00:51:25Z

    [REBASE] resolve conflict after rebasing to master

commit f09af11d815f54690082d3defffcbb797606881f
Author: Ravindra Pesala <ra...@...>
Date:   2017-11-21T10:19:11Z

    [CARBONDATA-1543] Supported DataMap chooser and expression for supporting multiple datamaps in single query
    
    This PR supports 3 features.
    
    1. Load datamaps from the DataMapSchema which are created through DDL.
    2. DataMap Chooser: It chooses a datamap from the available datamaps based on simple logic. For example, if there is a filter condition on column1 and 2 datamaps (1. column1 2. column1+column2) support this column, then we choose the datamap which has fewer columns, that is, the first datamap.
    3. Expression support: Based on the filter expressions, we convert them to the possible DataMap expressions and apply the expressions to the datamaps.
    For example, there are 2 datamaps available on table1
    Datamap1 : column1
    Datamap2 : column2
    Query: select * from table1 where column1 = 'a' and column2 = 'b'
    For the above query, we create the datamap expression AndDataMapExpression(Datamap1, DataMap2). Both datamaps are included, and an AND condition is applied to their outputs to improve performance
    
    This closes #1510
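
The AND combination from item 3 of the commit above can be sketched as a set intersection over the blocklets each datamap keeps. This is an illustration only: the real AndDataMapExpression is not shown in this thread, and the `Pruner` type, `and_expression`, and the blocklet ids below are assumptions.

```python
from typing import Callable, Set

# A pruner returns the ids of blocklets that may satisfy its part of the filter.
Pruner = Callable[[], Set[int]]


def and_expression(left: Pruner, right: Pruner) -> Pruner:
    """Combine two pruners: a blocklet survives only if both pruners keep it."""
    return lambda: left() & right()


def datamap1() -> Set[int]:
    """Hypothetical result of pruning on column1 = 'a'."""
    return {0, 1, 2}


def datamap2() -> Set[int]:
    """Hypothetical result of pruning on the column2 condition."""
    return {1, 2, 5}


pruned = and_expression(datamap1, datamap2)()
```

Intersecting the two results leaves fewer blocklets to scan than either datamap would on its own, which is where the performance gain comes from.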

commit 07d0e690b9b1d70cf539775b902ca2baf6c75be0
Author: xuchuanyin <xu...@...>
Date:   2018-02-08T06:35:14Z

    [CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row
    
    Pick up the no-sort fields in the row, pack them as a byte array, and skip parsing them during merge sort to reduce CPU consumption
    
    This closes #1792

commit 6dcb97f60e7abdf9980232592634b393f05b328e
Author: xuchuanyin <xu...@...>
Date:   2018-02-08T06:42:39Z

    [CARBONDATA-2023][DataLoad] Add size base block allocation in data loading
    
    Carbondata assigns blocks to nodes at the beginning of data loading.
    The previous block allocation strategy is based on block count and
    suffers from data skew if the sizes of the input files differ a lot.
    
    We introduced a size-based block allocation strategy to optimize data
    loading performance in skewed data scenarios.
    
    This closes #1808

commit b11ceaff0caed97d65fa67eb3df2e2d54230095d
Author: Jacky Li <ja...@...>
Date:   2018-02-11T13:37:04Z

    [CARBONDATA-2159] Remove carbon-spark dependency in store-sdk module
    
    To allow assembling a JAR of the store-sdk module, it should not depend on the carbon-spark module
    
    This closes #1970

commit c0f463fe273111ef037cadcfcf71e26e298e3848
Author: Jacky Li <ja...@...>
Date:   2018-02-13T01:12:09Z

    Support generating assembling JAR for store-sdk module
    
    Support generating assembling JAR for store-sdk module and remove junit dependency
    
    This closes #1976

commit babf28ecff64174ae5ab9636f0a461f3e0998a77
Author: xuchuanyin <xu...@...>
Date:   2018-02-13T02:58:06Z

    [CARBONDATA-2091][DataLoad] Support specifying sort column bounds in data loading
    
    Enhance data loading performance by specifying sort column bounds
    1. Add row range number during convert-process-step
    2. Dispatch rows to each sorter by range number
    3. Sort/Write process step can be done concurrently in each range
    4. Since all sort temp files will be written in one folder, we add the range
    number to the file name to distinguish them
    
    Tests added and docs updated
    
    This closes #1953

commit 248302c2ec8244e03b4e035bed11fa5a38992f25
Author: Jacky Li <ja...@...>
Date:   2018-02-20T03:16:53Z

    [CARBONDATA-2186] Add InterfaceAudience.Internal to annotate internal interface
    
    This closes #1986

commit 50e55987665821b568b3c495285c023063127246
Author: Jacky Li <ja...@...>
Date:   2018-02-22T12:59:59Z

    [CARBONDATA-2189] Add DataMapProvider developer interface
    
    Add developer interface for 2 types of DataMap:
    
    1. IndexDataMap: DataMap that leverages an index to accelerate filter queries
    2. MVDataMap: DataMap that leverages a Materialized View to accelerate OLAP-style queries, like SPJG queries (select, predicate, join, groupby)
    This PR adds support for following logic when creating and dropping the DataMap
    
    This closes #1987

commit 7f4b84802ddb11f6508b4ab47d6ab3c55cf79107
Author: xuchuanyin <xu...@...>
Date:   2018-02-24T13:18:17Z

    [CARBONDATA-1114][Tests] Fix bugs in tests in Windows env
    
    Fix bugs in tests that cause failures in a Windows environment
    
    This closes #1994

commit 18ee863a9ca3cf1d65d69ad6216ecec88c182f5b
Author: Jacky Li <ja...@...>
Date:   2018-02-26T02:04:51Z

    [HOTFIX] Add java doc for datamap interface
    
    1. Rename some of the datamap interfaces
    2. Add more java doc for all public classes of the datamap interface
    
    This closes #1998

commit 3e0d893d36ad11b9e61ab0eb6271546ca7c63346
Author: Jacky Li <ja...@...>
Date:   2018-02-26T08:30:38Z

    [CARBONDATA-2206] support lucene index datamap
    
    This PR is an initial effort to integrate Lucene as an index datamap into CarbonData.
    A new module called carbondata-lucene is added to support the Lucene datamap:
    
    1. Add LuceneFineGrainDataMap, implementing the FineGrainDataMap interface.
    2. Add LuceneCoarseGrainDataMap, implementing the CoarseGrainDataMap interface.
    3. Support writing the Lucene index via LuceneDataMapWriter.
    4. Implement LuceneDataMapFactory.
    5. Add a UDF called TEXT_MATCH, used to filter on string columns via Lucene.
    
    This closes #2003

commit 8c0d96c43ebb3a0553d3283201d6d3a226f6ebba
Author: Jacky Li <ja...@...>
Date:   2018-02-27T03:26:30Z

    [REBASE] Solve conflict after merging master

----


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    retest this please


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    SDV Build Failed, please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3797/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    retest sdv please


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    retest sdv please


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    retest sdv please


---

[GitHub] carbondata pull request #2043: PR-2030 sdv fix (Merging datamap branch into ...

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala closed the pull request at:

    https://github.com/apache/carbondata/pull/2043


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    retest sdv please


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by CarbonDataQA <gi...@git.apache.org>.

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2893/



---

[GitHub] carbondata issue #2043: PR-2030 sdv fix (Merging datamap branch into master)

Posted by CarbonDataQA <gi...@git.apache.org>.

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2894/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    retest sdv please


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by CarbonDataQA <gi...@git.apache.org>.

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2882/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    SDV Build Failed, please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3787/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by ravipesala <gi...@git.apache.org>.

Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    retest sdv please


---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by CarbonDataQA <gi...@git.apache.org>.

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4127/



---

[GitHub] carbondata issue #2043: [WIP] PR-2030 sdv fix

Posted by CarbonDataQA <gi...@git.apache.org>.

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2043
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4138/



---