Posted to commits@pinot.apache.org by "klsince (via GitHub)" <gi...@apache.org> on 2023/06/05 21:14:09 UTC
[GitHub] [pinot] klsince opened a new pull request, #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework
klsince opened a new pull request, #10847:
URL: https://github.com/apache/pinot/pull/10847
The previous fix, #9797, for leaked files in SegmentProcessorFramework was not enough: the mapper and reducer may also generate output files and hold on to them via mmap, so those files leak when an exception is thrown from the map or reduce phase, e.g. when the disk fills up during the mapping phase.
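The general pattern described above (track the files a stage creates, delete them if the stage fails, then let the failure propagate) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `CleanupOnFailureSketch`, `OutputTracker`, and `runStage` are hypothetical names, the failure is simulated, and memory mapping is not modeled at all.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a map/reduce stage that writes intermediate files. */
final class CleanupOnFailureSketch {

  /** Remembers every file the stage created so they can be deleted on failure. */
  static final class OutputTracker {
    private final List<Path> _created = new ArrayList<>();

    Path create(Path dir, String name) throws IOException {
      Path file = Files.createFile(dir.resolve(name));
      _created.add(file);
      return file;
    }

    /** Best-effort delete of everything created so far. */
    void cleanUp() {
      for (Path file : _created) {
        try {
          Files.deleteIfExists(file);
        } catch (IOException ignored) {
          // Keep going so one stubborn file does not block the rest.
        }
      }
      _created.clear();
    }
  }

  /** Writes two output files; when failMidway is true, fails after the first one. */
  static List<Path> runStage(Path dir, boolean failMidway) throws IOException {
    OutputTracker tracker = new OutputTracker();
    try {
      List<Path> outputs = new ArrayList<>();
      outputs.add(tracker.create(dir, "part-0"));
      if (failMidway) {
        // Simulated failure, standing in for e.g. a full disk.
        throw new IOException("simulated: no space left on device");
      }
      outputs.add(tracker.create(dir, "part-1"));
      return outputs;
    } catch (IOException e) {
      tracker.cleanUp(); // remove partial outputs before propagating the failure
      throw e;
    }
  }
}
```

Calling `CleanupOnFailureSketch.runStage(dir, true)` throws the simulated `IOException` and leaves `dir` empty, while `runStage(dir, false)` returns both files.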
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] gortiz commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework
Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218681132
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
+        for (GenericRowFileManager fileManager : _partitionToFileManagerMap.values()) {
+          fileManager.cleanUp();
+        }
+      }
+      throw new RuntimeException("Failed to complete process", e);
Review Comment:
Why change the exception? You should be able to do `throw e;`
[GitHub] [pinot] gortiz commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework
Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218682502
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
+        for (GenericRowFileManager fileManager : _partitionToFileManagerMap.values()) {
+          fileManager.cleanUp();
+        }
+      }
+      throw new RuntimeException("Failed to complete process", e);
Review Comment:
Same in `SegmentMapper` and other classes
[GitHub] [pinot] codecov-commenter commented on pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework
Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10847:
URL: https://github.com/apache/pinot/pull/10847#issuecomment-1577579268
## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
> Merging [#10847](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (5351574) into [master](https://app.codecov.io/gh/apache/pinot/commit/84688577e40ef34dcb2aadf558ec759ca80466da?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (8468857) will **decrease** coverage by `0.04%`.
> The diff coverage is `57.77%`.
```diff
@@ Coverage Diff @@
## master #10847 +/- ##
============================================
- Coverage 70.23% 70.20% -0.04%
+ Complexity 6585 5687 -898
============================================
Files 2170 2170
Lines 116665 116712 +47
Branches 17656 17666 +10
============================================
- Hits 81942 81937 -5
- Misses 29014 29065 +51
- Partials 5709 5710 +1
```
| Flag | Coverage Δ | |
|---|---|---|
| integration1 | `24.01% <40.00%> (-0.01%)` | :arrow_down: |
| integration2 | `23.79% <28.88%> (+0.19%)` | :arrow_up: |
| unittests1 | `67.75% <57.77%> (+<0.01%)` | :arrow_up: |
| unittests2 | `13.61% <0.00%> (+<0.01%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
|---|---|---|
| [.../core/segment/processing/mapper/SegmentMapper.java](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvbWFwcGVyL1NlZ21lbnRNYXBwZXIuamF2YQ==) | `82.85% <16.66%> (-6.21%)` | :arrow_down: |
| [.../core/segment/processing/reducer/DedupReducer.java](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvcmVkdWNlci9EZWR1cFJlZHVjZXIuamF2YQ==) | `90.47% <55.55%> (-9.53%)` | :arrow_down: |
| [...core/segment/processing/reducer/RollupReducer.java](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvcmVkdWNlci9Sb2xsdXBSZWR1Y2VyLmphdmE=) | `95.40% <55.55%> (-4.60%)` | :arrow_down: |
| [...rocessing/framework/SegmentProcessorFramework.java](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvZnJhbWV3b3JrL1NlZ21lbnRQcm9jZXNzb3JGcmFtZXdvcmsuamF2YQ==) | `91.66% <71.42%> (-7.02%)` | :arrow_down: |
... and [35 files with indirect coverage changes](https://app.codecov.io/gh/apache/pinot/pull/10847/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
[GitHub] [pinot] klsince commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework
Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218737635
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
Review Comment:
Good point. I think we should, as those are resources created by this processor.
[GitHub] [pinot] gortiz commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework
Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218737763
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
+        for (GenericRowFileManager fileManager : _partitionToFileManagerMap.values()) {
+          fileManager.cleanUp();
+        }
+      }
+      throw new RuntimeException("Failed to complete process", e);
Review Comment:
I don't know. I would prefer to get the specific exception that explains the failure rather than a generic runtime exception whose cause I have to check. But it is not a blocker; if you think it is better to throw the runtime exception, that is fine with me.
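To illustrate the difference being discussed, here is a small sketch of what a caller sees in each case. It is not taken from the PR: `RethrowSketch`, `doWork`, `processWrapping`, and `processRethrowing` are hypothetical names, and the failure is simulated.

```java
import java.io.IOException;

final class RethrowSketch {

  /** Hypothetical failing step, standing in for the real processing. */
  static void doWork() throws IOException {
    throw new IOException("simulated: no space left on device");
  }

  /** Wraps the failure: callers see a RuntimeException and must inspect getCause(). */
  static void processWrapping() throws Exception {
    try {
      doWork();
    } catch (Exception e) {
      // (cleanup would happen here)
      throw new RuntimeException("Failed to complete process", e);
    }
  }

  /** Rethrows the same exception object: callers can catch IOException directly. */
  static void processRethrowing() throws Exception {
    try {
      doWork();
    } catch (Exception e) {
      // (cleanup would happen here)
      throw e;
    }
  }

  public static void main(String[] args) {
    try {
      processWrapping();
    } catch (Exception e) {
      // The caller sees the wrapper type and has to unwrap the cause.
      System.out.println("wrapping:   " + e.getClass().getSimpleName()
          + " (cause: " + e.getCause().getClass().getSimpleName() + ")");
    }
    try {
      processRethrowing();
    } catch (Exception e) {
      // The caller sees the original exception type.
      System.out.println("rethrowing: " + e.getClass().getSimpleName());
    }
  }
}
```

With `throw e;` the original type and stack trace reach the caller unchanged; the wrapper only pays off when its message adds context that the original exception lacks.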
[GitHub] [pinot] swaminathanmanish commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework
Posted by "swaminathanmanish (via GitHub)" <gi...@apache.org>.
swaminathanmanish commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218686650
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
Review Comment:
Should we clean up `_segmentsOutputDir` as well?
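For reference, removing an output directory and everything under it can be sketched with the standard library alone. This is only an illustration of the idea, not what the PR does: `OutputDirCleanupSketch` and `deleteRecursively` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

final class OutputDirCleanupSketch {

  /** Deletes dir and everything under it; a missing dir is treated as already clean. */
  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      // Reverse order puts children before their parents, so directories are empty when deleted.
      paths.sorted(Comparator.reverseOrder()).forEach(path -> {
        try {
          Files.delete(path);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }
}
```

A failed delete surfaces as an `UncheckedIOException`; cleanup code running inside a `catch` block would probably want to log that rather than let it mask the original failure.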
[GitHub] [pinot] Jackie-Jiang merged pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework
Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang merged PR #10847:
URL: https://github.com/apache/pinot/pull/10847
[GitHub] [pinot] klsince commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework
Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218737175
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
+        for (GenericRowFileManager fileManager : _partitionToFileManagerMap.values()) {
+          fileManager.cleanUp();
+        }
+      }
+      throw new RuntimeException("Failed to complete process", e);
Review Comment:
I was thinking of adding a bit more context before bubbling it up.
[GitHub] [pinot] gortiz commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework
Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218682502
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
+        for (GenericRowFileManager fileManager : _partitionToFileManagerMap.values()) {
+          fileManager.cleanUp();
+        }
+      }
+      throw new RuntimeException("Failed to complete process", e);
Review Comment:
Same in `SegmentMapper`
[GitHub] [pinot] klsince commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework
Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218884702
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
+        for (GenericRowFileManager fileManager : _partitionToFileManagerMap.values()) {
+          fileManager.cleanUp();
+        }
+      }
+      throw new RuntimeException("Failed to complete process", e);
Review Comment:
Updated to throw `e` directly, as the extra context message passed to RuntimeException didn't carry much info anyway.