Posted to commits@pinot.apache.org by "klsince (via GitHub)" <gi...@apache.org> on 2023/06/05 21:14:09 UTC

[GitHub] [pinot] klsince opened a new pull request, #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework

klsince opened a new pull request, #10847:
URL: https://github.com/apache/pinot/pull/10847

   The previous fix (#9797) for leaked files in SegmentProcessorFramework was not sufficient: the mapper and reducer may also generate output files and hold on to them via mmap, so those files are leaked when an exception is thrown from the map or reduce phase, e.g. when the disk fills up during the mapping phase.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] gortiz commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework

Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218681132


##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
+        for (GenericRowFileManager fileManager : _partitionToFileManagerMap.values()) {
+          fileManager.cleanUp();
+        }
+      }
+      throw new RuntimeException("Failed to complete process", e);

Review Comment:
   Why change the exception? You should be able to do `throw e;`.
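
To illustrate the difference the comment above points at, here is a minimal sketch, not the PR's actual code. `Cleanable`, `doProcess`, `processRethrow` and `processWrap` are hypothetical stand-ins (for `GenericRowFileManager` and the framework's `process()` respectively), and the simulated `IOException` is an assumption. Both variants run the cleanup first; they differ only in what the caller ends up catching.

```java
import java.io.IOException;
import java.util.List;

class RethrowSketch {
  // Hypothetical stand-in for a file manager that owns intermediate files.
  interface Cleanable {
    void cleanUp();
  }

  // Simulates a map/reduce phase failing with a specific checked exception.
  static List<String> doProcess() throws Exception {
    throw new IOException("No space left on device");
  }

  // Variant 1: clean up, then rethrow the original exception unchanged.
  // This compiles because the method already declares `throws Exception`.
  static List<String> processRethrow(List<Cleanable> managers) throws Exception {
    try {
      return doProcess();
    } catch (Exception e) {
      for (Cleanable manager : managers) {
        manager.cleanUp();
      }
      throw e;
    }
  }

  // Variant 2: clean up, then wrap the failure in a RuntimeException.
  // Callers must call getCause() to see the original exception.
  static List<String> processWrap(List<Cleanable> managers) throws Exception {
    try {
      return doProcess();
    } catch (Exception e) {
      for (Cleanable manager : managers) {
        manager.cleanUp();
      }
      throw new RuntimeException("Failed to complete process", e);
    }
  }

  public static void main(String[] args) {
    Cleanable noop = () -> { };
    try {
      processRethrow(List.of(noop));
    } catch (Exception e) {
      System.out.println(e.getClass().getName()); // java.io.IOException
    }
    try {
      processWrap(List.of(noop));
    } catch (Exception e) {
      System.out.println(e.getClass().getName()); // java.lang.RuntimeException
    }
  }
}
```

With the rethrow variant a caller can write `catch (IOException e)` directly; with the wrapping variant the same caller has to catch `RuntimeException` and inspect its cause.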





[GitHub] [pinot] gortiz commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework

Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218682502


##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
+        for (GenericRowFileManager fileManager : _partitionToFileManagerMap.values()) {
+          fileManager.cleanUp();
+        }
+      }
+      throw new RuntimeException("Failed to complete process", e);

Review Comment:
   Same in `SegmentMapper` and other classes





[GitHub] [pinot] codecov-commenter commented on pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10847:
URL: https://github.com/apache/pinot/pull/10847#issuecomment-1577579268

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#10847](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (5351574) into [master](https://app.codecov.io/gh/apache/pinot/commit/84688577e40ef34dcb2aadf558ec759ca80466da?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (8468857) will **decrease** coverage by `0.04%`.
   > The diff coverage is `57.77%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #10847      +/-   ##
   ============================================
   - Coverage     70.23%   70.20%   -0.04%     
   + Complexity     6585     5687     -898     
   ============================================
     Files          2170     2170              
     Lines        116665   116712      +47     
     Branches      17656    17666      +10     
   ============================================
   - Hits          81942    81937       -5     
   - Misses        29014    29065      +51     
   - Partials       5709     5710       +1     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `24.01% <40.00%> (-0.01%)` | :arrow_down: |
   | integration2 | `23.79% <28.88%> (+0.19%)` | :arrow_up: |
   | unittests1 | `67.75% <57.77%> (+<0.01%)` | :arrow_up: |
   | unittests2 | `13.61% <0.00%> (+<0.01%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [.../core/segment/processing/mapper/SegmentMapper.java](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvbWFwcGVyL1NlZ21lbnRNYXBwZXIuamF2YQ==) | `82.85% <16.66%> (-6.21%)` | :arrow_down: |
   | [.../core/segment/processing/reducer/DedupReducer.java](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvcmVkdWNlci9EZWR1cFJlZHVjZXIuamF2YQ==) | `90.47% <55.55%> (-9.53%)` | :arrow_down: |
   | [...core/segment/processing/reducer/RollupReducer.java](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvcmVkdWNlci9Sb2xsdXBSZWR1Y2VyLmphdmE=) | `95.40% <55.55%> (-4.60%)` | :arrow_down: |
   | [...rocessing/framework/SegmentProcessorFramework.java](https://app.codecov.io/gh/apache/pinot/pull/10847?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvZnJhbWV3b3JrL1NlZ21lbnRQcm9jZXNzb3JGcmFtZXdvcmsuamF2YQ==) | `91.66% <71.42%> (-7.02%)` | :arrow_down: |
   
   ... and [35 files with indirect coverage changes](https://app.codecov.io/gh/apache/pinot/pull/10847/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   
   




[GitHub] [pinot] klsince commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework

Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218737635


##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {

Review Comment:
   Good point. I think we should, as those are resources created by this processor.





[GitHub] [pinot] gortiz commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework

Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218737763


##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
+        for (GenericRowFileManager fileManager : _partitionToFileManagerMap.values()) {
+          fileManager.cleanUp();
+        }
+      }
+      throw new RuntimeException("Failed to complete process", e);

Review Comment:
   I don't know. I would prefer to get the specific exception that caused the failure, rather than a generic runtime exception whose cause I have to check. But it is not a blocker: if you think it is better to throw the runtime exception, that is fine with me.





[GitHub] [pinot] swaminathanmanish commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework

Posted by "swaminathanmanish (via GitHub)" <gi...@apache.org>.
swaminathanmanish commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218686650


##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {

Review Comment:
   Should we clean up `_segmentsOutputDir` as well?
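
As a rough sketch of what cleaning up an output directory on failure could look like, using only the JDK: the thread does not show how Pinot actually does this, so `deleteRecursively`, `Phase` and `runWithCleanup` are hypothetical names, and `outputDir` merely plays the role of something like `_segmentsOutputDir`. The original exception is rethrown, and any failure during cleanup is attached to it as a suppressed exception rather than replacing it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

class OutputDirCleanupSketch {
  // Hypothetical unit of work that writes into the output directory.
  interface Phase {
    void run(Path outputDir) throws Exception;
  }

  // Deletes a directory tree; reverse ordering visits children before parents.
  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      paths.sorted(Comparator.reverseOrder()).forEach(path -> {
        try {
          Files.delete(path);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }

  // Runs the phase; on failure removes the output directory and rethrows.
  static void runWithCleanup(Path outputDir, Phase phase) throws Exception {
    try {
      phase.run(outputDir);
    } catch (Exception e) {
      try {
        deleteRecursively(outputDir);
      } catch (Exception cleanupFailure) {
        // Keep the original failure primary; record the cleanup problem alongside it.
        e.addSuppressed(cleanupFailure);
      }
      throw e;
    }
  }
}
```

Whether the whole directory should be removed, or only the files this processor created inside it, depends on who owns the directory; the sketch assumes the processor owns it outright.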





[GitHub] [pinot] Jackie-Jiang merged pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang merged PR #10847:
URL: https://github.com/apache/pinot/pull/10847




[GitHub] [pinot] klsince commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework

Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218737175


##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
+        for (GenericRowFileManager fileManager : _partitionToFileManagerMap.values()) {
+          fileManager.cleanUp();
+        }
+      }
+      throw new RuntimeException("Failed to complete process", e);

Review Comment:
   I was thinking of adding a bit more context before bubbling it up.





[GitHub] [pinot] gortiz commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework

Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218682502


##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
+        for (GenericRowFileManager fileManager : _partitionToFileManagerMap.values()) {
+          fileManager.cleanUp();
+        }
+      }
+      throw new RuntimeException("Failed to complete process", e);

Review Comment:
   Same in `SegmentMapper`





[GitHub] [pinot] klsince commented on a diff in pull request #10847: clean up output files upon exceptions more properly for SegmentProcessorFramework

Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #10847:
URL: https://github.com/apache/pinot/pull/10847#discussion_r1218884702


##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorFramework.java:
##########
@@ -91,33 +92,51 @@ public SegmentProcessorFramework(List<RecordReader> recordReaders, SegmentProces
    */
   public List<File> process()
       throws Exception {
+    try {
+      return doProcess();
+    } catch (Exception e) {
+      if (_partitionToFileManagerMap != null) {
+        for (GenericRowFileManager fileManager : _partitionToFileManagerMap.values()) {
+          fileManager.cleanUp();
+        }
+      }
+      throw new RuntimeException("Failed to complete process", e);

Review Comment:
   Updated to throw `e` directly, as the extra context message passed to `RuntimeException` didn't carry much information anyway.


