Posted to issues@carbondata.apache.org by ravipesala <gi...@git.apache.org> on 2017/12/16 17:17:47 UTC

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861] Support sh...

GitHub user ravipesala opened a pull request:

    https://github.com/apache/carbondata/pull/1674

    [CARBONDATA-1859][CARBONDATA-1861] Support show and drop partitions

    This PR depends on https://github.com/apache/carbondata/pull/1642, https://github.com/apache/carbondata/pull/1654, and https://github.com/apache/carbondata/pull/1672.
    
    It adds support for showing and dropping partitions.
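
    A minimal usage sketch of the new commands, assuming the Spark SQL partition DDL that this feature follows; the table, columns, and partition value are invented for illustration, and the Carbon session setup is omitted:

    ```scala
    import org.apache.spark.sql.SparkSession

    // Illustration only: names are hypothetical; a CarbonData-enabled
    // SparkSession (CarbonSession in 1.3.x) is assumed to exist already.
    def partitionDdlDemo(spark: SparkSession): Unit = {
      spark.sql(
        "CREATE TABLE IF NOT EXISTS sales (id INT, amount DOUBLE) " +
          "PARTITIONED BY (country STRING) STORED BY 'carbondata'")
      spark.sql("SHOW PARTITIONS sales").show()
      spark.sql("ALTER TABLE sales DROP IF EXISTS PARTITION (country = 'US')")
    }
    ```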
    
    Be sure to complete all of the following checklist items to help us incorporate
    your contribution quickly and easily:
    
     - [X] Any interfaces changed?
     - [X] Any backward compatibility impacted?
       NO
     - [X] Document update required?
       Yes
     - [X] Testing done
            Tests added
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata drop-partition

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1674.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1674
    
----
commit 6b49e6c4d9e5a960a657097487e94a83012fe491
Author: ravipesala <ra...@gmail.com>
Date:   2017-12-04T10:37:03Z

    Added outputformat for carbon

commit d8a88314b0a19b8e30a3b1ad1a3540f7f0964c7c
Author: ravipesala <ra...@gmail.com>
Date:   2017-12-12T06:12:45Z

    Added fileformat in carbon

commit 67c4f83c5b62f03dffb2fe8a8aa53923a889d411
Author: ravipesala <ra...@gmail.com>
Date:   2017-12-15T19:18:19Z

    Added support to query using standard partitions

commit a127ece350c1a9b1a1c7755cb51e4e1f9342cac3
Author: ravipesala <ra...@gmail.com>
Date:   2017-12-16T17:08:00Z

    Added drop partition feature

----


---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157742911
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapModel.java ---
    @@ -16,18 +16,35 @@
      */
     package org.apache.carbondata.core.indexstore.blockletindex;
     
    +import java.util.List;
    +
     import org.apache.carbondata.core.datamap.dev.DataMapModel;
     
     public class BlockletDataMapModel extends DataMapModel {
    --- End diff --
    
    ok


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2397/



---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r158055612
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/PartitionMapFileStore.java ---
    @@ -176,13 +179,87 @@ public PartitionMapper readPartitionMap(String partitionMapPath) {
       public void readAllPartitionsOfSegment(String segmentPath) {
         CarbonFile[] partitionFiles = getPartitionFiles(segmentPath);
         if (partitionFiles != null && partitionFiles.length > 0) {
    +      partionedSegment = true;
           for (CarbonFile file : partitionFiles) {
             PartitionMapper partitionMapper = readPartitionMap(file.getAbsolutePath());
             partitionMap.putAll(partitionMapper.getPartitionMap());
           }
         }
       }
     
    +  public boolean isPartionedSegment() {
    +    return partionedSegment;
    +  }
    +
    +  /**
    +   * Drops the partitions from the partition mapper file of the segment and writes to a new file.
    +   * @param segmentPath
    +   * @param partitionsToDrop
    +   * @param uniqueId
    +   * @throws IOException
    +   */
    +  public void dropPartitions(String segmentPath, List<String> partitionsToDrop, String uniqueId)
    --- End diff --
    
    1. For clean files, JIRA CARBONDATA-1863 has already been created as part of the design.
    2. The partition map is now created using the load timestamp, and it is not deleted while dropping the partition. While reading, we make sure we read the latest file using the latest timestamp. The load status is also updated with a new timestamp, so all the caches in the driver will be cleared.
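
    A small sketch of the "read the latest map by timestamp" idea described in point 2, using plain java.io and assuming an illustrative `<timestamp>.partitionmap` naming scheme (the real layout is defined by PartitionMapFileStore and CarbonTablePath):

    ```scala
    import java.io.File

    // Illustration only: assumes map files are named "<timestamp>.partitionmap".
    def latestPartitionMapFile(segmentDir: String): Option[File] = {
      val mapFiles = Option(new File(segmentDir).listFiles())
        .getOrElse(Array.empty[File])
        .filter(_.getName.endsWith(".partitionmap"))
      if (mapFiles.isEmpty) None
      else Some(mapFiles.maxBy(_.getName.stripSuffix(".partitionmap").toLong))
    }
    ```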
    



---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157675550
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java ---
    @@ -42,6 +42,17 @@
        */
       List<Blocklet> prune(FilterResolverIntf filterExp);
     
    +  // TODO Move this method to Abstract class
    --- End diff --
    
    This TODO is not valid now, right?


---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157741759
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java ---
    @@ -379,10 +397,21 @@ private void createSchema(SegmentProperties segmentProperties) throws MemoryExce
             new UnsafeMemoryDMStore(indexSchemas.toArray(new CarbonRowSchema[indexSchemas.size()]));
       }
     
    -  private void createTaskMinMaxSchema(SegmentProperties segmentProperties) throws MemoryException {
    +  private void createSummarySchema(SegmentProperties segmentProperties, List<String> partitions)
    --- End diff --
    
    ok


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2419/



---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157677564
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java ---
    @@ -491,6 +520,23 @@ public boolean isScanRequired(FilterResolverIntf filterExp) {
         return blocklets;
       }
     
    +  @Override public List<Blocklet> prune(FilterResolverIntf filterExp, List<String> partitions) {
    +    List<String> storedPartitions = getPartitions();
    --- End diff --
    
    Provide comments for the logic inside this function.


---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157675898
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java ---
    @@ -42,6 +42,17 @@
        */
       List<Blocklet> prune(FilterResolverIntf filterExp);
     
    +  // TODO Move this method to Abstract class
    +  /**
    +   * Prune the datamap with filter expression and partition information. It returns the list of
    +   * blocklets where these filters can exist.
    +   *
    +   * @param filterExp
    +   * @return
    +   */
    +  List<Blocklet> prune(FilterResolverIntf filterExp, List<String> partitions);
    +
    +  // TODO Move this method to Abstract class
    --- End diff --
    
    In the FG implementation I have already created abstract classes, so when we merge it we can move this there.


---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r158095586
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/PartitionMapFileStore.java ---
    @@ -176,13 +179,87 @@ public PartitionMapper readPartitionMap(String partitionMapPath) {
       public void readAllPartitionsOfSegment(String segmentPath) {
         CarbonFile[] partitionFiles = getPartitionFiles(segmentPath);
         if (partitionFiles != null && partitionFiles.length > 0) {
    +      partionedSegment = true;
           for (CarbonFile file : partitionFiles) {
             PartitionMapper partitionMapper = readPartitionMap(file.getAbsolutePath());
             partitionMap.putAll(partitionMapper.getPartitionMap());
           }
         }
       }
     
    +  public boolean isPartionedSegment() {
    +    return partionedSegment;
    +  }
    +
    +  /**
    +   * Drops the partitions from the partition mapper file of the segment and writes to a new file.
    +   * @param segmentPath
    +   * @param partitionsToDrop
    +   * @param uniqueId
    +   * @throws IOException
    +   */
    +  public void dropPartitions(String segmentPath, List<String> partitionsToDrop, String uniqueId)
    --- End diff --
    
    Always reading the latest file can make a query read an intermediate state, so the query should always stick to the previously committed timestamp.
    It is better to handle this in a separate issue: "support a committed-timestamp-based read interface for datamaps so that they read only committed content".


---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157675931
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java ---
    @@ -42,6 +42,17 @@
        */
       List<Blocklet> prune(FilterResolverIntf filterExp);
     
    +  // TODO Move this method to Abstract class
    --- End diff --
    
    In the FG implementation I have already created abstract classes, so when we merge it we can move this there.


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2066/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/892/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2369/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    retest this please


---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157675599
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java ---
    @@ -42,6 +42,17 @@
        */
       List<Blocklet> prune(FilterResolverIntf filterExp);
     
    +  // TODO Move this method to Abstract class
    +  /**
    +   * Prune the datamap with filter expression and partition information. It returns the list of
    +   * blocklets where these filters can exist.
    +   *
    +   * @param filterExp
    +   * @return
    +   */
    +  List<Blocklet> prune(FilterResolverIntf filterExp, List<String> partitions);
    +
    +  // TODO Move this method to Abstract class
    --- End diff --
    
    This TODO is not valid now, right?


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    LGTM


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2118/



---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157829858
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/PartitionMapFileStore.java ---
    @@ -176,13 +179,87 @@ public PartitionMapper readPartitionMap(String partitionMapPath) {
       public void readAllPartitionsOfSegment(String segmentPath) {
         CarbonFile[] partitionFiles = getPartitionFiles(segmentPath);
         if (partitionFiles != null && partitionFiles.length > 0) {
    +      partionedSegment = true;
           for (CarbonFile file : partitionFiles) {
             PartitionMapper partitionMapper = readPartitionMap(file.getAbsolutePath());
             partitionMap.putAll(partitionMapper.getPartitionMap());
           }
         }
       }
     
    +  public boolean isPartionedSegment() {
    +    return partionedSegment;
    +  }
    +
    +  /**
    +   * Drops the partitions from the partition mapper file of the segment and writes to a new file.
    +   * @param segmentPath
    +   * @param partitionsToDrop
    +   * @param uniqueId
    +   * @throws IOException
    +   */
    +  public void dropPartitions(String segmentPath, List<String> partitionsToDrop, String uniqueId)
    +      throws IOException {
    +    readAllPartitionsOfSegment(segmentPath);
    +    List<String> indexesToDrop = new ArrayList<>();
    +    for (Map.Entry<String, List<String>> entry: partitionMap.entrySet()) {
    +      for (String partition: partitionsToDrop) {
    +        if (entry.getValue().contains(partition)) {
    +          indexesToDrop.add(entry.getKey());
    +        }
    +      }
    +    }
    +    if (indexesToDrop.size() > 0) {
    +      // Remove the indexes from partition map
    +      for (String indexToDrop : indexesToDrop) {
    +        partitionMap.remove(indexToDrop);
    +      }
    +      PartitionMapper mapper = new PartitionMapper();
    +      mapper.setPartitionMap(partitionMap);
    +      String path = segmentPath + "/" + uniqueId + CarbonTablePath.PARTITION_MAP_EXT;
    +      writePartitionFile(mapper, path);
    +    }
    +  }
    +
    +  /**
    +   * It deletes the old partition mapper files in case of success. And in case of failure it removes
    +   * the old new file.
    +   * @param segmentPath
    +   * @param uniqueId
    +   * @param success
    +   */
    +  public void commitPartitions(String segmentPath, final String uniqueId, boolean success) {
    +    CarbonFile carbonFile = FileFactory.getCarbonFile(segmentPath);
    --- End diff --
    
    Locks are not taken while dropping partitions in the Hive metastore; don't we need to take them?


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/917/



---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157740999
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java ---
    @@ -71,12 +72,14 @@ public BlockletDataMap get(TableBlockIndexUniqueIdentifier identifier)
         BlockletDataMap dataMap = (BlockletDataMap) lruCache.get(lruCacheKey);
         if (dataMap == null) {
           try {
    +        String segmentPath = CarbonTablePath
    --- End diff --
    
    ok


---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157678533
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapModel.java ---
    @@ -16,18 +16,35 @@
      */
     package org.apache.carbondata.core.indexstore.blockletindex;
     
    +import java.util.List;
    +
     import org.apache.carbondata.core.datamap.dev.DataMapModel;
     
     public class BlockletDataMapModel extends DataMapModel {
    --- End diff --
    
    Please provide a comment for this class.


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2371/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861] Support show and ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/833/



---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157741064
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java ---
    @@ -102,18 +105,26 @@ public BlockletDataMap get(TableBlockIndexUniqueIdentifier identifier)
           }
           if (missedIdentifiers.size() > 0) {
             Map<String, SegmentIndexFileStore> segmentIndexFileStoreMap = new HashMap<>();
    +        Map<String, PartitionMapFileStore> partitionFileStoreMap = new HashMap<>();
             for (TableBlockIndexUniqueIdentifier identifier: missedIdentifiers) {
               SegmentIndexFileStore indexFileStore =
                   segmentIndexFileStoreMap.get(identifier.getSegmentId());
    +          PartitionMapFileStore partitionFileStore =
    +              partitionFileStoreMap.get(identifier.getSegmentId());
    +          String segmentPath = CarbonTablePath
    --- End diff --
    
    ok


---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157675696
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java ---
    @@ -71,12 +72,14 @@ public BlockletDataMap get(TableBlockIndexUniqueIdentifier identifier)
         BlockletDataMap dataMap = (BlockletDataMap) lruCache.get(lruCacheKey);
         if (dataMap == null) {
           try {
    +        String segmentPath = CarbonTablePath
    --- End diff --
    
    change it to:
    ```
    String segmentPath = CarbonTablePath.getSegmentPath(
         identifier.getAbsoluteTableIdentifier().getTablePath(),
         identifier.getSegmentId());
    ```


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2149/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    retest this please


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2460/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2188/



---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157676422
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java ---
    @@ -102,18 +105,26 @@ public BlockletDataMap get(TableBlockIndexUniqueIdentifier identifier)
           }
           if (missedIdentifiers.size() > 0) {
             Map<String, SegmentIndexFileStore> segmentIndexFileStoreMap = new HashMap<>();
    +        Map<String, PartitionMapFileStore> partitionFileStoreMap = new HashMap<>();
             for (TableBlockIndexUniqueIdentifier identifier: missedIdentifiers) {
               SegmentIndexFileStore indexFileStore =
                   segmentIndexFileStoreMap.get(identifier.getSegmentId());
    +          PartitionMapFileStore partitionFileStore =
    +              partitionFileStoreMap.get(identifier.getSegmentId());
    +          String segmentPath = CarbonTablePath
    --- End diff --
    
    change it to:
    ```
    String segmentPath = CarbonTablePath.getSegmentPath(
         identifier.getAbsoluteTableIdentifier().getTablePath(),
         identifier.getSegmentId());
    ```


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/966/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861] Support show and ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/829/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/886/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2398/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861] Support show and ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2060/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2403/



---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157742667
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java ---
    @@ -491,6 +520,23 @@ public boolean isScanRequired(FilterResolverIntf filterExp) {
         return blocklets;
       }
     
    +  @Override public List<Blocklet> prune(FilterResolverIntf filterExp, List<String> partitions) {
    +    List<String> storedPartitions = getPartitions();
    --- End diff --
    
    ok, added


---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/1674


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    retest sdv please


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861] Support show and ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2056/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2111/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2166/



---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157677188
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java ---
    @@ -379,10 +397,21 @@ private void createSchema(SegmentProperties segmentProperties) throws MemoryExce
             new UnsafeMemoryDMStore(indexSchemas.toArray(new CarbonRowSchema[indexSchemas.size()]));
       }
     
    -  private void createTaskMinMaxSchema(SegmentProperties segmentProperties) throws MemoryException {
    +  private void createSummarySchema(SegmentProperties segmentProperties, List<String> partitions)
    --- End diff --
    
    Provide a comment for this function.


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861] Support show and ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2365/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2433/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/885/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/840/



---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r158022717
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/PartitionMapFileStore.java ---
    @@ -176,13 +179,87 @@ public PartitionMapper readPartitionMap(String partitionMapPath) {
       public void readAllPartitionsOfSegment(String segmentPath) {
         CarbonFile[] partitionFiles = getPartitionFiles(segmentPath);
         if (partitionFiles != null && partitionFiles.length > 0) {
    +      partionedSegment = true;
           for (CarbonFile file : partitionFiles) {
             PartitionMapper partitionMapper = readPartitionMap(file.getAbsolutePath());
             partitionMap.putAll(partitionMapper.getPartitionMap());
           }
         }
       }
     
    +  public boolean isPartionedSegment() {
    +    return partionedSegment;
    +  }
    +
    +  /**
    +   * Drops the partitions from the partition mapper file of the segment and writes to a new file.
    +   * @param segmentPath
    +   * @param partitionsToDrop
    +   * @param uniqueId
    +   * @throws IOException
    +   */
    +  public void dropPartitions(String segmentPath, List<String> partitionsToDrop, String uniqueId)
    --- End diff --
    
    1) Clean files needs to be enhanced to delete the data of dropped partitions.
    2) Reading the partition map should adhere to the segUpdatetimestamp/transactionid from the tablestatus file, and drop partition should create the map file with the segUpdatetimestamp and update it in the segment status afterwards. Old files can then be deleted during cleanup after the max query time.
    This will avoid reading a temporary state during a query (see the sketch after this list).
    3) A drop partition done in one driver should also drop the related partitions cached in other drivers, so there should be a way to drop them using the segUpdatetimestamp/transactionid.
    4) The same applies to the insert overwrite scenario as well.
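
    A rough sketch of the committed-timestamp read proposed in point 2 — the helper name and file naming are hypothetical, not the current implementation:

    ```scala
    import java.io.File

    // Hypothetical sketch: read only the map whose name matches the segment
    // update timestamp committed in tablestatus, ignoring newer uncommitted files.
    def committedPartitionMapFile(segmentDir: String,
        committedUpdateTimestamp: Long): Option[File] = {
      Option(new File(segmentDir).listFiles())
        .getOrElse(Array.empty[File])
        .find(_.getName == s"$committedUpdateTimestamp.partitionmap")
    }
    ```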


---

[GitHub] carbondata pull request #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION]...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1674#discussion_r157988312
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/PartitionMapFileStore.java ---
    @@ -176,13 +179,87 @@ public PartitionMapper readPartitionMap(String partitionMapPath) {
       public void readAllPartitionsOfSegment(String segmentPath) {
         CarbonFile[] partitionFiles = getPartitionFiles(segmentPath);
         if (partitionFiles != null && partitionFiles.length > 0) {
    +      partionedSegment = true;
           for (CarbonFile file : partitionFiles) {
             PartitionMapper partitionMapper = readPartitionMap(file.getAbsolutePath());
             partitionMap.putAll(partitionMapper.getPartitionMap());
           }
         }
       }
     
    +  public boolean isPartionedSegment() {
    +    return partionedSegment;
    +  }
    +
    +  /**
    +   * Drops the partitions from the partition mapper file of the segment and writes to a new file.
    +   * @param segmentPath
    +   * @param partitionsToDrop
    +   * @param uniqueId
    +   * @throws IOException
    +   */
    +  public void dropPartitions(String segmentPath, List<String> partitionsToDrop, String uniqueId)
    +      throws IOException {
    +    readAllPartitionsOfSegment(segmentPath);
    +    List<String> indexesToDrop = new ArrayList<>();
    +    for (Map.Entry<String, List<String>> entry: partitionMap.entrySet()) {
    +      for (String partition: partitionsToDrop) {
    +        if (entry.getValue().contains(partition)) {
    +          indexesToDrop.add(entry.getKey());
    +        }
    +      }
    +    }
    +    if (indexesToDrop.size() > 0) {
    +      // Remove the indexes from partition map
    +      for (String indexToDrop : indexesToDrop) {
    +        partitionMap.remove(indexToDrop);
    +      }
    +      PartitionMapper mapper = new PartitionMapper();
    +      mapper.setPartitionMap(partitionMap);
    +      String path = segmentPath + "/" + uniqueId + CarbonTablePath.PARTITION_MAP_EXT;
    +      writePartitionFile(mapper, path);
    +    }
    +  }
    +
    +  /**
    +   * It deletes the old partition mapper files in case of success. And in case of failure it removes
    +   * the old new file.
    +   * @param segmentPath
    +   * @param uniqueId
    +   * @param success
    +   */
    +  public void commitPartitions(String segmentPath, final String uniqueId, boolean success) {
    +    CarbonFile carbonFile = FileFactory.getCarbonFile(segmentPath);
    --- End diff --
    
    I think it is better not to take any lock inside processMetadata, as some users run this command from a different machine. So the lock is taken only in processData, and if anything goes wrong we add the partition back in the undoMetadata command.
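
    A schematic of the command flow described above — the trait and method names follow the processMetadata/processData/undoMetadata lifecycle mentioned here, but this is only an illustration, not the PR's actual drop-partition command:

    ```scala
    // Schematic only: lock handling is represented by comments/println;
    // the real command lives in the carbondata-spark integration module.
    trait AtomicCommandSketch {
      def processMetadata(): Unit // update Hive metastore, no Carbon lock taken
      def processData(): Unit     // take locks, rewrite the partition map files
      def undoMetadata(): Unit    // on failure, add the dropped partition back
    }

    class DropPartitionSketch(partitionSpec: String) extends AtomicCommandSketch {
      override def processMetadata(): Unit =
        println(s"drop $partitionSpec from metastore (no lock)")
      override def processData(): Unit =
        println(s"acquire locks, write new partition map without $partitionSpec")
      override def undoMetadata(): Unit =
        println(s"restore $partitionSpec in the metastore")
    }
    ```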


---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2109/



---

[GitHub] carbondata issue #1674: [CARBONDATA-1859][CARBONDATA-1861][PARTITION] Suppor...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1674
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/937/



---