Posted to oak-issues@jackrabbit.apache.org by "Vikas Saurabh (JIRA)" <ji...@apache.org> on 2017/12/12 21:29:00 UTC

[jira] [Comment Edited] (OAK-7052) Active deletion purge can OOM if number of blobs listed in a file becomes too large

    [ https://issues.apache.org/jira/browse/OAK-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16288299#comment-16288299 ] 

Vikas Saurabh edited comment on OAK-7052 at 12/12/17 9:28 PM:
--------------------------------------------------------------

[~tmueller],
I think this patch should do the trick, but I can't think of a way to test it (short of watching memory usage during an active deletion purge cycle):
{noformat}
diff --git a/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/directory/ActiveDeletedBlobCollectorFactory.java b/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/directory/ActiveDeletedBlobCollectorFactory.java
index ec7afefb96..01d8c52f47 100644
--- a/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/directory/ActiveDeletedBlobCollectorFactory.java
+++ b/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/directory/ActiveDeletedBlobCollectorFactory.java
@@ -22,6 +22,7 @@ import com.google.common.collect.Lists;
 import com.google.common.io.Closeables;
 import com.google.common.io.Files;
 import org.apache.commons.io.FileUtils;
+import org.apache.commons.io.LineIterator;
 import org.apache.commons.io.filefilter.IOFileFilter;
 import org.apache.commons.io.filefilter.RegexFileFilter;
 import org.apache.jackrabbit.core.data.DataStoreException;
@@ -218,11 +219,14 @@ public class ActiveDeletedBlobCollectorFactory {
                     continue;
                 }
                 if (timestamp < before) {
+                    LineIterator blobLineIter = null;
                     try {
-                        for (String deletedBlobLine : FileUtils.readLines(deletedBlobListFile, (String) null)) {
+                        blobLineIter = FileUtils.lineIterator(deletedBlobListFile);
+                        while (blobLineIter.hasNext()) {
                             if (cancelled) {
                                 break;
                             }
+                            String deletedBlobLine = blobLineIter.next();
 
                             String[] parsedDeletedBlobIdLine = deletedBlobLine.split("\\|", 3);
                             if (parsedDeletedBlobIdLine.length != 3) {
@@ -274,6 +278,10 @@ public class ActiveDeletedBlobCollectorFactory {
                     } catch (IOException ioe) {
                         //log error and continue
                         LOG.warn("Couldn't read deleted blob list file - " + deletedBlobListFile, ioe);
+                    } finally {
+                        // close in finally so the file handle is released on the
+                        // success path too, not only when an IOException is thrown
+                        LineIterator.closeQuietly(blobLineIter);
                     }
 
                     // OAK-6314 revealed that blobs appended might not be immediately available. So, we'd skip
{noformat}
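
One way to exercise this outside Oak could be a small standalone harness (a hypothetical sketch, not part of the patch; class name and line format are made up): write a few million lines to a temp file and run with a deliberately small heap, e.g. {{-Xmx32m}} - the streaming read completes, while the eager {{FileUtils#readLines}} call runs out of memory.
{noformat}
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

import org.apache.commons.io.FileUtils;
import org.apache.commons.io.LineIterator;

// Hypothetical harness, not part of the patch. Run with a small heap
// (e.g. -Xmx32m) to compare eager vs. streaming reads of a large file.
public class BlobListMemoryCheck {
    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("deleted-blobs", ".txt");
        file.deleteOnExit();

        // Write ~200MB of three-'|'-field lines; the field layout is only
        // illustrative, the purge loop just splits each line on '|'.
        try (FileWriter w = new FileWriter(file)) {
            for (int i = 0; i < 5_000_000; i++) {
                w.write("blob-" + i + "|1513110540000|/index/path/" + i + "\n");
            }
        }

        // Streaming read: holds one line at a time, finishes under -Xmx32m.
        LineIterator iter = null;
        long count = 0;
        try {
            iter = FileUtils.lineIterator(file);
            while (iter.hasNext()) {
                iter.nextLine();
                count++;
            }
        } finally {
            LineIterator.closeQuietly(iter); // null-safe
        }
        System.out.println("streamed " + count + " lines");

        // Eager read (the call the patch replaces): uncommenting this
        // should throw OutOfMemoryError under the same heap cap.
        // FileUtils.readLines(file, (String) null);
    }
}
{noformat}
({{LineIterator.closeQuietly}} sits in {{finally}} for the same reason as in the patch: the iterator keeps the file open until it is closed.)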


> Active deletion purge can OOM if number of blobs listed in a file becomes too large
> ----------------------------------------------------------------------------------
>
>                 Key: OAK-7052
>                 URL: https://issues.apache.org/jira/browse/OAK-7052
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene
>            Reporter: Vikas Saurabh
>            Assignee: Vikas Saurabh
>             Fix For: 1.8
>
>
> Active deletion, while purging, reads blob ids from a file using {{FileUtils#readLines}}. That method returns a list of all lines, while we only need to iterate over the file contents line by line.
> We should probably switch to {{FileUtils#lineIterator}} to clearly declare our intent to read line by line.
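
Distilled to a minimal sketch (illustrative names; the real change lives in {{ActiveDeletedBlobCollectorFactory}}, and closing in {{finally}} keeps the file handle from leaking on the success path):
{noformat}
import java.io.File;
import java.io.IOException;

import org.apache.commons.io.FileUtils;
import org.apache.commons.io.LineIterator;

public class DeletedBlobListSketch {
    // FileUtils.readLines() returns the whole file as a List<String>;
    // FileUtils.lineIterator() streams it, holding one line at a time.
    static void purge(File deletedBlobListFile) throws IOException {
        LineIterator iter = null;
        try {
            iter = FileUtils.lineIterator(deletedBlobListFile);
            while (iter.hasNext()) {
                String line = iter.nextLine();
                // ... parse the line and delete the referenced blob ...
            }
        } finally {
            // release the underlying stream on success as well as failure
            LineIterator.closeQuietly(iter);
        }
    }
}
{noformat}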



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)