Posted to dev@carbondata.apache.org by GitBox <gi...@apache.org> on 2021/10/09 10:35:34 UTC
[GitHub] [carbondata] jack86596 opened a new pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
jack86596 opened a new pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232
### Why is this PR needed?
Currently, the clean files command deletes all Marked for Delete and Compacted segments once the number of these segments reaches carbon.invisible.segments.preserve.count. This delete operation may take a long time, and the user cannot choose to delete only some of these segments. It would be better to enhance the clean files command so that the user can specify the segments to be deleted.
### What changes were proposed in this PR?
1. The clean files command supports specifying segment ids. The syntax is "clean files for table table_name options("segment_ids"="id1,id2,id3...")". If segment ids are specified, only the segments with these ids are physically deleted.
2. Lock handling is refactored: during clean files, the tablestatus lock is taken at the beginning and released at the end. While the lock is held, the tablestatus file is read only once (previously it could be read 10+ times), and all operations are performed on that single read, such as changing segment visibility and moving segments with visibility = false to the tablestatus.history file.
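The lock handling in item 2 can be illustrated with a minimal Python sketch. It is not the CarbonData implementation: `TableStatusStore`, `Segment`, and `clean_files` are hypothetical names, the tablestatus and tablestatus.history files are modelled as in-memory lists, and the preserve-count threshold is left out. It only shows the pattern of taking the lock once, reading once, working on that snapshot, and writing once.

```python
import threading
from dataclasses import dataclass


@dataclass
class Segment:
    segment_id: str
    status: str            # e.g. "Success", "Marked for Delete", "Compacted"
    visible: bool = True


class TableStatusStore:
    """Hypothetical in-memory stand-in for tablestatus and tablestatus.history."""

    def __init__(self, segments):
        self._segments = list(segments)
        self.history = []
        self.lock = threading.Lock()
        self.read_count = 0
        self.write_count = 0

    def read(self):
        # Return copies so callers work on a snapshot, as with a file read.
        self.read_count += 1
        return [Segment(s.segment_id, s.status, s.visible) for s in self._segments]

    def write(self, segments, history):
        self.write_count += 1
        self._segments = list(segments)
        self.history = list(history)


def clean_files(store):
    """Hold the lock for the whole operation and read the status only once."""
    removable = {"Marked for Delete", "Compacted"}
    with store.lock:                      # taken at the beginning, released at the end
        segments = store.read()           # the single read
        for seg in segments:              # change visibility on the snapshot
            if seg.status in removable:
                seg.visible = False
        moved = [s for s in segments if not s.visible]
        remaining = [s for s in segments if s.visible]
        store.write(remaining, store.history + moved)   # the single write
    return [s.segment_id for s in moved]
```

Because every step works on the same snapshot, the number of reads no longer depends on how many sub-operations clean files performs.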
### Does this PR introduce any user interface change?
- Yes. One more option is added to the clean files command: segment_ids. Its value is the list of segment ids the user wants to delete. Only Marked for Delete and Compacted segment ids are valid. If invalid ids are given, the operation fails immediately. If segments are specified, the force option is ignored.
CLEAN FILES FOR TABLE TABLE_NAME options('segment_ids'='0,1,2')
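The option semantics described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not code from the PR: `select_segments_to_clean`, `parse_segment_ids`, and the dictionary of segment statuses are hypothetical, and returning `None` is a placeholder for "fall back to the existing clean files behavior".

```python
REMOVABLE_STATUSES = {"Marked for Delete", "Compacted"}


def parse_segment_ids(option_value):
    """Split a value such as '0,1,2' into a list of id strings."""
    return [part.strip() for part in option_value.split(",") if part.strip()]


def select_segments_to_clean(segment_status, options):
    """Return the segment ids to delete, or None if segment_ids was not given.

    segment_status: dict mapping segment id -> status (hypothetical input).
    options: dict of options passed to the clean files command.
    """
    raw = options.get("segment_ids")
    if raw is None:
        return None  # placeholder: the existing behavior applies

    requested = parse_segment_ids(raw)
    # Ids that are unknown or not in a removable status are invalid.
    invalid = [sid for sid in requested
               if segment_status.get(sid) not in REMOVABLE_STATUSES]
    if invalid:
        # Fail the whole operation instead of cleaning a subset.
        raise ValueError("Invalid segment ids: " + ",".join(invalid))

    # The 'force' option is deliberately not consulted on this path.
    return requested
```

For example, with segments 0 (Compacted), 1 (Marked for Delete), and 2 (Success), passing `'segment_ids'='0,1'` selects segments 0 and 1, while `'segment_ids'='0,2'` raises an error because segment 2 is still a valid, visible segment.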
### Is any new testcase added?
- Yes
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232#issuecomment-939899789
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4300/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232#issuecomment-939895656
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6044/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232#issuecomment-939275677
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6042/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232#issuecomment-939915645
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/434/
[GitHub] [carbondata] jack86596 commented on pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
Posted by GitBox <gi...@apache.org>.
jack86596 commented on pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232#issuecomment-939771668
retest this please
[GitHub] [carbondata] akashrn5 commented on pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
Posted by GitBox <gi...@apache.org>.
akashrn5 commented on pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232#issuecomment-943067945
@jack86596 this is a behavioral and functional change. So instead of directly raising a PR with more code changes, it is better to first raise a discussion in the community, take the inputs, and then make the changes accordingly.
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232#issuecomment-939347490
[GitHub] [carbondata] jack86596 edited a comment on pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
Posted by GitBox <gi...@apache.org>.
jack86596 edited a comment on pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232#issuecomment-943071877
OK, I will raise a discussion on the mailing list.
[GitHub] [carbondata] jack86596 commented on pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
Posted by GitBox <gi...@apache.org>.
jack86596 commented on pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232#issuecomment-943071877
OK, I will raise a discussion in the main list.
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232#issuecomment-939275841
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4298/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232#issuecomment-939352436
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/433/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4232: [CARBONDATA-4300] Clean files command supports specify segment ids
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4232:
URL: https://github.com/apache/carbondata/pull/4232#issuecomment-939275813
Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/432/