Posted to issues@carbondata.apache.org by gvramana <gi...@git.apache.org> on 2018/04/09 09:23:07 UTC

[GitHub] carbondata pull request #2128: [CARBONDATA-2303] If dataload is failed for p...

Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2128#discussion_r180034323
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala ---
    @@ -151,13 +152,88 @@ object CarbonStore {
             }
           }
         } finally {
    +      if (currentTablePartitions.isEmpty) {
    +        cleanUpPartitionFoldersRecursively(carbonTable, List.empty[PartitionSpec])
    +      } else {
    +        cleanUpPartitionFoldersRecursively(carbonTable, currentTablePartitions.get.toList)
    +      }
    +
           if (carbonCleanFilesLock != null) {
             CarbonLockUtil.fileUnlock(carbonCleanFilesLock, LockUsage.CLEAN_FILES_LOCK)
           }
         }
         LOGGER.audit(s"Clean files operation is success for $dbName.$tableName.")
       }
     
    +  /**
    +   * Delete partition folders recursively.
    +   *
    +   * @param carbonTable       table whose stale partition folders are cleaned up
    +   * @param partitionSpecList current partition specs of the table
    +   */
    +  def cleanUpPartitionFoldersRecursively(carbonTable: CarbonTable,
    +      partitionSpecList: List[PartitionSpec]): Unit = {
    +    if (carbonTable != null) {
    +      val loadMetadataDetails = SegmentStatusManager
    --- End diff --
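
    A side note on the hunk above: since Option already encodes the empty case, the None/Some branch can collapse into a single call, e.g.:

        // Equivalent one-liner for the None/Some branch in the hunk above.
        cleanUpPartitionFoldersRecursively(carbonTable,
          currentTablePartitions.map(_.toList).getOrElse(List.empty[PartitionSpec]))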
    
    1. Partition folders cannot be deleted, as there is no way to check whether a new data load is still using them.
    2. Clean files should not take multiple snapshots of the file system; read the metadata once and reuse it.
    3. A partition location is also valid for partitions located inside the table path, so those folders should not be scanned twice.
    4. The CarbonFile interface should be used for file system operations (see the sketch after this list).
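
    As a minimal sketch of points 2 and 4 (assumptions, not the required fix): read the load metadata once, build a staleness predicate from that single snapshot, and route every file operation through the CarbonFile interface. The helper name below is hypothetical, and the FileFactory/CarbonFile signatures are assumed to match their use elsewhere in carbondata-core:

        // Hypothetical sketch; verify FileFactory/CarbonFile signatures
        // against the carbondata-core version in use.
        import org.apache.carbondata.core.datastore.filesystem.CarbonFile
        import org.apache.carbondata.core.datastore.impl.FileFactory

        object PartitionCleanupSketch {

          // Walks one folder tree through the CarbonFile interface (point 4).
          // The staleness decision comes from a predicate the caller builds
          // once from a single metadata snapshot (point 2); per point 1 it
          // must stay conservative and never touch folders that an in-flight
          // data load may still be writing into.
          def cleanFolder(path: String, isStale: CarbonFile => Boolean): Unit = {
            val folder = FileFactory.getCarbonFile(path, FileFactory.getFileType(path))
            if (folder.exists()) {
              folder.listFiles().foreach { child =>
                if (child.isDirectory) {
                  cleanFolder(child.getAbsolutePath, isStale)
                } else if (isStale(child)) {
                  child.delete()
                }
              }
            }
          }
        }

    Keeping the predicate outside the traversal is what makes point 2 checkable: the caller reads the load metadata (e.g. via SegmentStatusManager.readLoadMetadata) exactly once and closes over the result, so the loop never takes a second snapshot of the file system.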


---