Posted to dev@carbondata.apache.org by GitBox <gi...@apache.org> on 2021/12/22 09:54:32 UTC

[GitHub] [carbondata] vikramahuja1001 opened a new pull request #4245: [WIP] Fixed clean files not deleting stale delete delta files after horizontal compaction

vikramahuja1001 opened a new pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245


    ### Why is this PR needed?
    After horizontal compaction was performed on partition and non-partition tables, the clean files operation was not deleting the stale delete delta files. The code that handled this was removed as part of an earlier clean files refactoring.
    
    ### What changes were proposed in this PR?
    Clean files with the force option now removes these stale delete delta files, as well as the stale tableupdatestatus file, for both partition and non-partition tables.
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes. 2 test cases have been added.
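
For illustration only, here is a rough Java sketch of how stale delete delta files could be picked out after horizontal compaction. It is not the code from this PR. It assumes, hypothetically, that each delete delta file is named `<blockName>-<timestamp>.deletedelta` and that only the newest file per block is still needed; the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch, not CarbonData code.
final class StaleDeleteDeltaSketch {

  private static final String SUFFIX = ".deletedelta";

  // Assumed name layout: <blockName>-<timestamp>.deletedelta
  static String blockNameOf(String fileName) {
    String base = fileName.substring(0, fileName.length() - SUFFIX.length());
    return base.substring(0, base.lastIndexOf('-'));
  }

  static long timestampOf(String fileName) {
    String base = fileName.substring(0, fileName.length() - SUFFIX.length());
    return Long.parseLong(base.substring(base.lastIndexOf('-') + 1));
  }

  // Returns every delete delta file that is older than the newest one for its block.
  static List<String> findStale(List<String> fileNames) {
    Map<String, Long> newestPerBlock = new HashMap<>();
    for (String name : fileNames) {
      if (name.endsWith(SUFFIX)) {
        newestPerBlock.merge(blockNameOf(name), timestampOf(name), Long::max);
      }
    }
    List<String> stale = new ArrayList<>();
    for (String name : fileNames) {
      if (name.endsWith(SUFFIX)
          && timestampOf(name) < newestPerBlock.get(blockNameOf(name))) {
        stale.add(name);
      }
    }
    return stale;
  }

  public static void main(String[] args) {
    List<String> files = Arrays.asList(
        "part-0-0-100.deletedelta",   // superseded by the compacted file below
        "part-0-0-200.deletedelta",   // newest for this block, kept
        "part-0-1-50.deletedelta",    // only file for its block, kept
        "part-0-0.carbondata");       // not a delete delta file, ignored
    System.out.println(findStale(files));
  }
}
```

The real implementation would rely on the table's update status metadata rather than on file names alone; the sketch only shows the "keep the newest, drop the rest" idea.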
   
       
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#issuecomment-999541552


   Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4436/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#issuecomment-999715336


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6181/
   





[GitHub] [carbondata] kunal642 commented on pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
kunal642 commented on pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#issuecomment-1002125020


   LGTM





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#issuecomment-999736017


   Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/572/
   





[GitHub] [carbondata] asfgit closed pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245


   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#issuecomment-1001950212


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4445/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#issuecomment-1001949587


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6189/
   





[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
vikramahuja1001 commented on a change in pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#discussion_r775725942



##########
File path: core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
##########
@@ -688,4 +689,124 @@ public static long getLatestDeleteDeltaTimestamp(String[] deleteDeltaFiles) {
     }
     return latestTimestamp;
   }
+
+
+  /**
+   * Handling of the clean up of old carbondata files, index files , delete delta,
+   * update status files.
+   *
+   * @param table       clean up will be handled on this table.
+   * @param isDryRun if true then max query execution timeout will not be considered.
+   */
+  public static long cleanUpDeltaFiles(CarbonTable table, boolean isDryRun) throws IOException {

Review comment:
       This is for the dry run config. I had put the wrong param description; changed it now.
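
To make the dry run distinction concrete, here is a minimal Java sketch under stated assumptions. It is not the body of `cleanUpDeltaFiles`. It assumes the `long` return value is the total size in bytes of the stale files, and that a dry run only reports that size without deleting anything; the class name, method name, and file names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;

// Illustrative sketch, not CarbonData code.
final class DryRunCleanupSketch {

  // Sums the size of the given stale files and deletes them only when isDryRun is false.
  static long cleanUp(List<Path> staleFiles, boolean isDryRun) throws IOException {
    long totalBytes = 0L;
    for (Path file : staleFiles) {
      totalBytes += Files.size(file);
      if (!isDryRun) {
        Files.delete(file);
      }
    }
    return totalBytes;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("cleanup-sketch");
    Path stale = Files.write(dir.resolve("part-0-0-100.deletedelta"), new byte[] {1, 2, 3});
    List<Path> files = Collections.singletonList(stale);

    long wouldFree = cleanUp(files, true);   // dry run: the file is left in place
    System.out.println(wouldFree + " bytes would be freed, exists=" + Files.exists(stale));

    long freed = cleanUp(files, false);      // real run: the file is deleted
    System.out.println(freed + " bytes freed, exists=" + Files.exists(stale));
  }
}
```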







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#issuecomment-999531634


   Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6180/
   





[GitHub] [carbondata] kunal642 commented on a change in pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
kunal642 commented on a change in pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#discussion_r775422007



##########
File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java
##########
@@ -855,6 +855,26 @@ private static void updateSegmentMetadataDetails(LoadMetadataDetails loadMetadat
     }
   }
 
+  /**
+   * This API will return the update status file name.
+   * @param segmentList
+   * @return
+   */
+  public String getUpdateStatusFileName(LoadMetadataDetails[] segmentList) {
+    if (segmentList.length == 0) {
+      return "";
+    }
+    else {
+      for (LoadMetadataDetails eachSeg : segmentList) {
+        // file name stored in 0th segment.
+        if (eachSeg.getLoadName().equalsIgnoreCase("0")) {
+          return eachSeg.getUpdateStatusFileName();

Review comment:
       Why always segment 0?







[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
vikramahuja1001 commented on a change in pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#discussion_r775793018



##########
File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java
##########
@@ -855,6 +855,26 @@ private static void updateSegmentMetadataDetails(LoadMetadataDetails loadMetadat
     }
   }
 
+  /**
+   * This API will return the update status file name.
+   * @param segmentList
+   * @return
+   */
+  public String getUpdateStatusFileName(LoadMetadataDetails[] segmentList) {
+    if (segmentList.length == 0) {
+      return "";
+    }
+    else {
+      for (LoadMetadataDetails eachSeg : segmentList) {
+        // file name stored in 0th segment.
+        if (eachSeg.getLoadName().equalsIgnoreCase("0")) {
+          return eachSeg.getUpdateStatusFileName();

Review comment:
       Only segment 0 will have the updateStatusFileName; for all other segments it will be an empty string.
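
The lookup described in this reply can be sketched in a self-contained way as follows. `SegmentDetail` is a hypothetical stand-in for `LoadMetadataDetails` carrying only the two fields used here. The quoted hunk is cut off before its fallback, so returning an empty string when no segment 0 is present is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch, not CarbonData code.
final class UpdateStatusLookupSketch {

  // Hypothetical stand-in for LoadMetadataDetails.
  static final class SegmentDetail {
    final String loadName;
    final String updateStatusFileName;

    SegmentDetail(String loadName, String updateStatusFileName) {
      this.loadName = loadName;
      this.updateStatusFileName = updateStatusFileName;
    }
  }

  // Returns the update status file name stored on segment "0",
  // or an empty string if the list is empty or has no segment "0" (assumed fallback).
  static String findUpdateStatusFileName(List<SegmentDetail> segments) {
    for (SegmentDetail segment : segments) {
      if ("0".equals(segment.loadName)) {
        return segment.updateStatusFileName;
      }
    }
    return "";
  }

  public static void main(String[] args) {
    List<SegmentDetail> segments = Arrays.asList(
        new SegmentDetail("1", ""),                        // other segments carry an empty name
        new SegmentDetail("0", "tableupdatestatus-123"));  // hypothetical file name
    System.out.println(findUpdateStatusFileName(segments));
  }
}
```

Because only one segment carries the name, the loop can return as soon as it finds segment 0 and does not need to inspect the rest of the list.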







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#issuecomment-1002007455


   Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/580/
   





[GitHub] [carbondata] kunal642 commented on a change in pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
kunal642 commented on a change in pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#discussion_r775411417



##########
File path: core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
##########
@@ -688,4 +689,124 @@ public static long getLatestDeleteDeltaTimestamp(String[] deleteDeltaFiles) {
     }
     return latestTimestamp;
   }
+
+
+  /**
+   * Handling of the clean up of old carbondata files, index files , delete delta,
+   * update status files.
+   *
+   * @param table       clean up will be handled on this table.
+   * @param isDryRun if true then max query execution timeout will not be considered.
+   */
+  public static long cleanUpDeltaFiles(CarbonTable table, boolean isDryRun) throws IOException {

Review comment:
       Change isDryRun to forceDelete.







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#issuecomment-999724753


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4437/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#issuecomment-999528245


   Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/571/
   





[GitHub] [carbondata] vikramahuja1001 commented on pull request #4245: [CARBONDATA-4319] Fixed clean files not deleting stale delete delta files after horizontal compaction

Posted by GitBox <gi...@apache.org>.
vikramahuja1001 commented on pull request #4245:
URL: https://github.com/apache/carbondata/pull/4245#issuecomment-999610063


   Retest this please.

