Posted to issues@carbondata.apache.org by rahulforallp <gi...@git.apache.org> on 2018/04/01 12:13:54 UTC

[GitHub] carbondata pull request #2128: [WIP] partition table clean files fixed

GitHub user rahulforallp opened a pull request:

    https://github.com/apache/carbondata/pull/2128

    [WIP] partition table clean files fixed



You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rahulforallp/incubator-carbondata part_tab_cleanFile

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2128.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2128
    
----
commit 8044edb5afa858fa72ae7b2d0d1cf0685cf92597
Author: rahulforallp <ra...@...>
Date:   2018-04-01T12:08:51Z

    partition table clean files fixed

----


---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    LGTM


---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] [WIP] If dataload is failed for pa...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3564/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3727/



---

[GitHub] carbondata issue #2128: [WIP] partition table clean files fixed

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4222/



---

[GitHub] carbondata issue #2128: [WIP] partition table clean files fixed

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3486/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4359/



---

[GitHub] carbondata issue #2128: [WIP] partition table clean files fixed

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4220/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4401/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3728/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4947/



---

[GitHub] carbondata pull request #2128: [CARBONDATA-2303] If dataload is failed for p...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2128#discussion_r180099123
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/LocalCarbonFile.java ---
    @@ -156,6 +158,25 @@ public boolean delete() {
     
       }
     
    +  @Override
    +  public CarbonFile[] listFiles(Boolean recurssive) {
    +    if (!file.isDirectory()) {
    +      return new CarbonFile[0];
    +    }
    +    String[] filter = null;
    +    Collection<File> fileCollection = FileUtils.listFiles(file, null, true);
    +    File[] files = fileCollection.toArray(new File[fileCollection.size()]);
    +    if (files == null) {
    +      return new CarbonFile[0];
    +    }
    +    CarbonFile[] carbonFiles = new CarbonFile[files.length];
    --- End diff --
    
    Copy directly into the array.
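
A minimal sketch of what copying directly into the result array could look like, under stated assumptions. `FileHandle` and `listFilesRecursively` are hypothetical stand-ins, not CarbonData classes, and `java.nio.file.Files.walk` is swapped in for the `FileUtils.listFiles` call in the diff so the example needs only the JDK. Since `Collection.toArray` never returns null, the null check on the intermediate `File[]` is not needed once that array is gone.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class RecursiveListingSketch {

  /** Hypothetical stand-in for LocalCarbonFile; not the CarbonData class. */
  static final class FileHandle {
    final File file;

    FileHandle(File file) {
      this.file = file;
    }
  }

  /** Lists every regular file below dir and wraps each one straight into the result array. */
  static FileHandle[] listFilesRecursively(File dir) {
    if (!dir.isDirectory()) {
      return new FileHandle[0];
    }
    // Files.walk must be closed, hence the try-with-resources block.
    try (Stream<Path> paths = Files.walk(dir.toPath())) {
      List<Path> regularFiles = paths.filter(Files::isRegularFile).collect(Collectors.toList());
      FileHandle[] handles = new FileHandle[regularFiles.size()];
      int i = 0;
      for (Path path : regularFiles) {
        // No intermediate File[] array: each entry is wrapped as it is copied.
        handles[i++] = new FileHandle(path.toFile());
      }
      return handles;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Calling `listFilesRecursively` on a path that is not a directory returns an empty array, matching the guard at the top of the method in the diff.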


---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] [WIP] If dataload is failed for pa...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3556/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4949/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] [WIP] If dataload is failed for pa...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4733/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3577/



---

[GitHub] carbondata issue #2128: [WIP] partition table clean files fixed

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4225/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] [WIP] If dataload is failed for pa...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3506/



---

[GitHub] carbondata pull request #2128: [CARBONDATA-2303] If dataload is failed for p...

Posted by rahulforallp <gi...@git.apache.org>.
Github user rahulforallp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2128#discussion_r180156407
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/LocalCarbonFile.java ---
    @@ -156,6 +158,25 @@ public boolean delete() {
     
       }
     
    +  @Override
    +  public CarbonFile[] listFiles(Boolean recurssive) {
    +    if (!file.isDirectory()) {
    +      return new CarbonFile[0];
    +    }
    +    String[] filter = null;
    +    Collection<File> fileCollection = FileUtils.listFiles(file, null, true);
    +    File[] files = fileCollection.toArray(new File[fileCollection.size()]);
    +    if (files == null) {
    +      return new CarbonFile[0];
    +    }
    +    CarbonFile[] carbonFiles = new CarbonFile[files.length];
    --- End diff --
    
    done


---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4347/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] [WIP] If dataload is failed for pa...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4258/



---

[GitHub] carbondata issue #2128: [WIP] partition table clean files fixed

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4716/



---

[GitHub] carbondata pull request #2128: [CARBONDATA-2303] If dataload is failed for p...

Posted by rahulforallp <gi...@git.apache.org>.
Github user rahulforallp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2128#discussion_r180156365
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala ---
    @@ -151,13 +153,82 @@ object CarbonStore {
             }
           }
         } finally {
    +      if (currentTablePartitions.equals(None)) {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, List.empty[PartitionSpec])
    +      } else {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, currentTablePartitions.get.toList)
    +      }
    +
           if (carbonCleanFilesLock != null) {
             CarbonLockUtil.fileUnlock(carbonCleanFilesLock, LockUsage.CLEAN_FILES_LOCK)
           }
         }
         LOGGER.audit(s"Clean files operation is success for $dbName.$tableName.")
       }
     
    +  /**
    +   * delete partition folders recurssively
    +   *
    +   * @param carbonTable
    +   * @param partitionSpecList
    +   */
    +  def cleanUpPartitionFoldersRecurssively(carbonTable: CarbonTable,
    +      partitionSpecList: List[PartitionSpec]): Unit = {
    +    if (carbonTable != null) {
    +      val loadMetadataDetails = SegmentStatusManager
    +        .readLoadMetadata(carbonTable.getMetadataPath)
    +
    +      val fileType = FileFactory.getFileType(carbonTable.getTablePath)
    +      val carbonFile = FileFactory.getCarbonFile(carbonTable.getTablePath, fileType)
    +
    +      // list all files from table path
    +      val listOfDefaultPartFilesIterator = carbonFile.listFiles(true)
    +      loadMetadataDetails.foreach { metadataDetail =>
    +        if (metadataDetail.getSegmentStatus.equals(SegmentStatus.MARKED_FOR_DELETE) &&
    +            metadataDetail.getSegmentFile == null) {
    +          val loadStartTime: Long = metadataDetail.getLoadStartTime
    +          // delete all files of @loadStartTime from tablepath
    +          cleanPartitionFolder(listOfDefaultPartFilesIterator, loadStartTime)
    +          partitionSpecList.foreach {
    +            partitionSpec =>
    +              val partitionLocation = partitionSpec.getLocation
    +              // For partition folder outside the tablePath
    +              if (!partitionLocation.toString.startsWith(carbonTable.getTablePath)) {
    +                val fileType = FileFactory.getFileType(partitionLocation.toString)
    +                val partitionCarbonFile = FileFactory
    +                  .getCarbonFile(partitionLocation.toString, fileType)
    +                // list all files from partitionLoacation
    +                val listOfExternalPartFilesIterator = partitionCarbonFile.listFiles(true)
    +                // delete all files of @loadStartTime from externalPath
    +                cleanPartitionFolder(listOfExternalPartFilesIterator, loadStartTime)
    +              }
    +          }
    +        }
    +      }
    +    }
    +  }
    +
    +  /**
    +   *
    +   * @param carbonFiles
    +   * @param timestamp
    +   */
    +  private def cleanPartitionFolder(carbonFiles: Array[CarbonFile],
    +      timestamp: Long): Unit = {
    +    carbonFiles.foreach {
    +      carbonFile =>
    +        val filePath = carbonFile.getPath
    +        val fileName = carbonFile.getName
    +        if (fileName.lastIndexOf("-") > 0 && fileName.lastIndexOf(".") > 0) {
    +          if (fileName.substring(fileName.lastIndexOf("-") + 1, fileName.lastIndexOf("."))
    --- End diff --
    
    done


---

[GitHub] carbondata pull request #2128: [CARBONDATA-2303] If dataload is failed for p...

Posted by rahulforallp <gi...@git.apache.org>.
Github user rahulforallp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2128#discussion_r180079486
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala ---
    @@ -151,13 +152,88 @@ object CarbonStore {
             }
           }
         } finally {
    +      if (currentTablePartitions.equals(None)) {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, List.empty[PartitionSpec])
    +      } else {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, currentTablePartitions.get.toList)
    +      }
    +
           if (carbonCleanFilesLock != null) {
             CarbonLockUtil.fileUnlock(carbonCleanFilesLock, LockUsage.CLEAN_FILES_LOCK)
           }
         }
         LOGGER.audit(s"Clean files operation is success for $dbName.$tableName.")
       }
     
    +  /**
    +   * delete partition folders recurssively
    +   *
    +   * @param carbonTable
    +   * @param partitionSpecList
    +   */
    +  def cleanUpPartitionFoldersRecurssively(carbonTable: CarbonTable,
    +      partitionSpecList: List[PartitionSpec]): Unit = {
    +    if (carbonTable != null) {
    +      val loadMetadataDetails = SegmentStatusManager
    --- End diff --
    
    1. Partition folders cannot be deleted, as there is no way to check whether a new data load is using them. ==> Done
    2. Should not take multiple snapshots of the file system during clean files. ==> Earlier we were not taking the snapshot recursively, so it is required here for partition folders.
    3. Partition location will be valid for partitions inside the table path also; those folders should not be scanned twice. ==> Done
    4. CarbonFile interface should be used for filesystem operations. ==> Done
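
One way to keep the number of snapshots down, sketched under stated assumptions: list the files once, index the names by the timestamp embedded in them, and then look up each failed load in that index instead of rescanning. The class and method names are hypothetical, and the file-name layout (timestamp between the last `-` and the last `.`) is taken from the substring logic in the diff.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SnapshotIndexSketch {

  /** Groups file names by the text between the last '-' and the last '.'. */
  static Map<String, List<String>> indexByTimestamp(List<String> fileNames) {
    Map<String, List<String>> index = new HashMap<>();
    for (String name : fileNames) {
      int dash = name.lastIndexOf('-');
      int dot = name.lastIndexOf('.');
      // Names without a "-<timestamp>." part are left out of the index.
      if (dash >= 0 && dot > dash) {
        String timestamp = name.substring(dash + 1, dot);
        index.computeIfAbsent(timestamp, key -> new ArrayList<>()).add(name);
      }
    }
    return index;
  }

  /** Collects the files of every failed load from the single, prebuilt index. */
  static List<String> filesToDelete(Map<String, List<String>> index,
      List<Long> failedLoadStartTimes) {
    List<String> result = new ArrayList<>();
    for (long startTime : failedLoadStartTimes) {
      result.addAll(index.getOrDefault(String.valueOf(startTime),
          Collections.<String>emptyList()));
    }
    return result;
  }
}
```

With this shape the listing is walked once per location, however many segments are marked for delete.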


---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4404/



---

[GitHub] carbondata issue #2128: [WIP] partition table clean files fixed

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4719/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4889/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4340/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4364/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4871/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3655/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] [WIP] If dataload is failed for pa...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4788/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by rahulforallp <gi...@git.apache.org>.
Github user rahulforallp commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    retest this please


---

[GitHub] carbondata issue #2128: [WIP] partition table clean files fixed

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4226/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4403/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] [WIP] If dataload is failed for pa...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3525/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3672/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4895/



---

[GitHub] carbondata issue #2128: [WIP] partition table clean files fixed

Posted by rahulforallp <gi...@git.apache.org>.
Github user rahulforallp commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    retest this please


---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] [WIP] If dataload is failed for pa...

Posted by rahulforallp <gi...@git.apache.org>.
Github user rahulforallp commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    retest sdv please


---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] [WIP] If dataload is failed for pa...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4284/



---

[GitHub] carbondata issue #2128: [WIP] partition table clean files fixed

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4223/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3649/



---

[GitHub] carbondata pull request #2128: [CARBONDATA-2303] If dataload is failed for p...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2128


---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3730/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3666/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] [WIP] If dataload is failed for pa...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4752/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] [WIP] If dataload is failed for pa...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4245/



---

[GitHub] carbondata issue #2128: [WIP] partition table clean files fixed

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3492/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4878/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by rahulforallp <gi...@git.apache.org>.
Github user rahulforallp commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    retest this please


---

[GitHub] carbondata pull request #2128: [CARBONDATA-2303] If dataload is failed for p...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2128#discussion_r180109707
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala ---
    @@ -151,13 +153,82 @@ object CarbonStore {
             }
           }
         } finally {
    +      if (currentTablePartitions.equals(None)) {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, List.empty[PartitionSpec])
    +      } else {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, currentTablePartitions.get.toList)
    +      }
    +
           if (carbonCleanFilesLock != null) {
             CarbonLockUtil.fileUnlock(carbonCleanFilesLock, LockUsage.CLEAN_FILES_LOCK)
           }
         }
         LOGGER.audit(s"Clean files operation is success for $dbName.$tableName.")
       }
     
    +  /**
    +   * delete partition folders recurssively
    +   *
    +   * @param carbonTable
    +   * @param partitionSpecList
    +   */
    +  def cleanUpPartitionFoldersRecurssively(carbonTable: CarbonTable,
    +      partitionSpecList: List[PartitionSpec]): Unit = {
    +    if (carbonTable != null) {
    +      val loadMetadataDetails = SegmentStatusManager
    +        .readLoadMetadata(carbonTable.getMetadataPath)
    +
    +      val fileType = FileFactory.getFileType(carbonTable.getTablePath)
    +      val carbonFile = FileFactory.getCarbonFile(carbonTable.getTablePath, fileType)
    +
    +      // list all files from table path
    +      val listOfDefaultPartFilesIterator = carbonFile.listFiles(true)
    +      loadMetadataDetails.foreach { metadataDetail =>
    +        if (metadataDetail.getSegmentStatus.equals(SegmentStatus.MARKED_FOR_DELETE) &&
    +            metadataDetail.getSegmentFile == null) {
    +          val loadStartTime: Long = metadataDetail.getLoadStartTime
    +          // delete all files of @loadStartTime from tablepath
    +          cleanPartitionFolder(listOfDefaultPartFilesIterator, loadStartTime)
    +          partitionSpecList.foreach {
    +            partitionSpec =>
    +              val partitionLocation = partitionSpec.getLocation
    +              // For partition folder outside the tablePath
    +              if (!partitionLocation.toString.startsWith(carbonTable.getTablePath)) {
    +                val fileType = FileFactory.getFileType(partitionLocation.toString)
    +                val partitionCarbonFile = FileFactory
    +                  .getCarbonFile(partitionLocation.toString, fileType)
    +                // list all files from partitionLoacation
    +                val listOfExternalPartFilesIterator = partitionCarbonFile.listFiles(true)
    +                // delete all files of @loadStartTime from externalPath
    +                cleanPartitionFolder(listOfExternalPartFilesIterator, loadStartTime)
    +              }
    +          }
    +        }
    +      }
    +    }
    +  }
    +
    +  /**
    +   *
    +   * @param carbonFiles
    +   * @param timestamp
    +   */
    +  private def cleanPartitionFolder(carbonFiles: Array[CarbonFile],
    +      timestamp: Long): Unit = {
    +    carbonFiles.foreach {
    +      carbonFile =>
    +        val filePath = carbonFile.getPath
    +        val fileName = carbonFile.getName
    +        if (fileName.lastIndexOf("-") > 0 && fileName.lastIndexOf(".") > 0) {
    +          if (fileName.substring(fileName.lastIndexOf("-") + 1, fileName.lastIndexOf("."))
    --- End diff --
    
    The getCarbonFileTimeStamp function can be moved to CarbonTablePath.
    Change the function name to cleanCarbonFilesInFolder.
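
A minimal sketch of the kind of helper this comment points at, assuming the timestamp sits between the last `-` and the last `.` of the file name as in the substring call in the diff. `CarbonFileNameSketch`, `timestampOf`, and `belongsToLoad` are hypothetical names, the sample file name in the usage note is illustrative only, and the string comparison against the load start time is an assumption because the diff is cut off before that point.

```java
final class CarbonFileNameSketch {

  /** Returns the text between the last '-' and the last '.', or null when the name has no such part. */
  static String timestampOf(String fileName) {
    int dash = fileName.lastIndexOf('-');
    int dot = fileName.lastIndexOf('.');
    // Requiring dot > dash also avoids a substring call with begin > end.
    if (dash < 0 || dot <= dash) {
      return null;
    }
    return fileName.substring(dash + 1, dot);
  }

  /** True when the file name carries the given load start time. */
  static boolean belongsToLoad(String fileName, long loadStartTime) {
    return String.valueOf(loadStartTime).equals(timestampOf(fileName));
  }
}
```

For a name such as `part-0-0_batchno0-0-1522584531000.carbondata`, `timestampOf` yields `1522584531000`. Keeping the parsing in one place means the cleanup loop only has to call `belongsToLoad`.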


---

[GitHub] carbondata issue #2128: [WIP] partition table clean files fixed

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4713/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4800/



---

[GitHub] carbondata pull request #2128: [CARBONDATA-2303] If dataload is failed for p...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2128#discussion_r180034323
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala ---
    @@ -151,13 +152,88 @@ object CarbonStore {
             }
           }
         } finally {
    +      if (currentTablePartitions.equals(None)) {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, List.empty[PartitionSpec])
    +      } else {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, currentTablePartitions.get.toList)
    +      }
    +
           if (carbonCleanFilesLock != null) {
             CarbonLockUtil.fileUnlock(carbonCleanFilesLock, LockUsage.CLEAN_FILES_LOCK)
           }
         }
         LOGGER.audit(s"Clean files operation is success for $dbName.$tableName.")
       }
     
    +  /**
    +   * delete partition folders recurssively
    +   *
    +   * @param carbonTable
    +   * @param partitionSpecList
    +   */
    +  def cleanUpPartitionFoldersRecurssively(carbonTable: CarbonTable,
    +      partitionSpecList: List[PartitionSpec]): Unit = {
    +    if (carbonTable != null) {
    +      val loadMetadataDetails = SegmentStatusManager
    --- End diff --
    
    1. Partition folders cannot be deleted, as there is no way to check whether a new data load is using them.
    2. Should not take multiple snapshots of the file system during clean files.
    3. Partition location will be valid for partitions inside the table path also; those folders should not be scanned twice.
    4. CarbonFile interface should be used for filesystem operations.
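
Point 3 can be illustrated with a small sketch, assuming plain local-style paths: keep only the partition locations that lie outside the table path, so folders already covered by the table-path listing are not scanned again. `PartitionScanSketch` and `externalLocations` are hypothetical names. The sketch uses `java.nio.file.Path.startsWith`, which compares whole path elements, rather than a string prefix check; URI-style locations such as HDFS paths would need different handling.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class PartitionScanSketch {

  /** Returns only the partition locations that are not below tablePath. */
  static List<String> externalLocations(String tablePath, List<String> partitionLocations) {
    Path table = Paths.get(tablePath).normalize();
    List<String> external = new ArrayList<>();
    for (String location : partitionLocations) {
      // Path.startsWith matches on path elements, so ".../t10" is not treated as inside ".../t1".
      if (!Paths.get(location).normalize().startsWith(table)) {
        external.add(location);
      }
    }
    return external;
  }
}
```

Only the returned locations would then need their own recursive listing; everything else is already part of the table-path snapshot.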


---

[GitHub] carbondata issue #2128: [WIP] partition table clean files fixed

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3489/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4946/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] [WIP] If dataload is failed for pa...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4782/



---

[GitHub] carbondata issue #2128: [CARBONDATA-2303] If dataload is failed for parition...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2128
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4290/



---