Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2019/01/31 20:54:00 UTC

[jira] [Comment Edited] (PDFBOX-4450) java.lang.OutOfMemoryError when validating pdf

    [ https://issues.apache.org/jira/browse/PDFBOX-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757698#comment-16757698 ] 

Tilman Hausherr edited comment on PDFBOX-4450 at 1/31/19 8:53 PM:
------------------------------------------------------------------

Page 136 has a loop:
{code:java}
Root/Pages/Kids/[135]/Resources/XObject/Fm0/Resources/XObject/R1579/Resources/XObject/R1579/Resources/XObject/R1579{code}
There is no simple solution for this… For your file, the problem will likely go away by using
{code:java}
document.getContext().getConfig().setMaxErrors(100){code}
This works because the first few pages already produce enough errors to hit that limit, so validation stops before it reaches page 136.
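To make the reasoning behind the workaround concrete, here is a minimal sketch of a validation loop with an error cap. It is a generic model only, not Preflight's actual code: the class `MaxErrorsSketch`, the method `validate` and the per-page check function are hypothetical names, and the exact point at which Preflight compares the error count against the limit is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical model of an error-capped validation loop (not PDFBox code).
final class MaxErrorsSketch {

    /**
     * Checks pages in order and stops once more than maxErrors errors
     * have been collected. Returns the index of the last page that was
     * actually examined, or -1 if no page was examined.
     */
    static int validate(int pageCount, int maxErrors,
                        IntFunction<List<String>> pageCheck,
                        List<String> errors) {
        int lastVisited = -1;
        for (int i = 0; i < pageCount; i++) {
            if (errors.size() > maxErrors) {
                break; // cap reached: later pages are never looked at
            }
            errors.addAll(pageCheck.apply(i));
            lastVisited = i;
        }
        return lastVisited;
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        // Assume every page reports 30 errors; the cap is 100.
        int last = validate(200, 100,
                i -> Collections.nCopies(30, "error on page index " + i),
                errors);
        System.out.println("last page index examined: " + last);
        System.out.println("errors collected: " + errors.size());
    }
}
```

With a low cap and an error-heavy file, the loop ends long before page index 135, which is why the looping page is never touched. A file whose early pages are clean would still reach it.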

I also tried with VeraPDF, but that one aborted with an NPE, see linked issue.
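For readers who want to picture the loop on page 136: the XObject R1579 lists itself in its own Resources/XObject dictionary, so any recursive walk (or recursive string dump) that does not remember its ancestors never terminates. The sketch below models that with plain Java objects and finds the loop by tracking the nodes on the current path. The `Node` class and `findLoop` method are hypothetical illustrations, not PDFBox types, and this is not a proposed fix for Preflight.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of nested form XObjects (not PDFBox code).
final class XObjectLoopSketch {

    /** Stand-in for a form XObject: a name plus its nested XObjects. */
    static final class Node {
        final String name;
        final Map<String, Node> xobjects = new LinkedHashMap<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Returns the path to the first node that is its own ancestor, or null. */
    static String findLoop(Node root) {
        // Identity-based set: two nodes are "the same" only if they are the same object.
        Set<Node> onPath = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<String> path = new ArrayDeque<>();
        return walk(root, onPath, path) ? String.join("/", path) : null;
    }

    private static boolean walk(Node node, Set<Node> onPath, Deque<String> path) {
        path.addLast(node.name);
        if (!onPath.add(node)) {
            return true; // node is already an ancestor: loop found, keep the path
        }
        for (Node child : node.xobjects.values()) {
            if (walk(child, onPath, path)) {
                return true;
            }
        }
        // No loop below this node: undo the bookkeeping and backtrack.
        onPath.remove(node);
        path.removeLast();
        return false;
    }

    public static void main(String[] args) {
        Node fm0 = new Node("Fm0");
        Node r1579 = new Node("R1579");
        fm0.xobjects.put("R1579", r1579);
        r1579.xobjects.put("R1579", r1579); // the XObject references itself
        System.out.println(findLoop(fm0));
    }
}
```

Detecting the loop is the easy part; deciding what every validation step should do once it meets one is where the real work would be.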


was (Author: tilman):
Page 136 has a loop:
{code:java}
Root/Pages/Kids/[135]/Resources/XObject/Fm0/Resources/XObject/R1579/Resources/XObject/R1579/Resources/XObject/R1579{code}
There is no simple solution for this… For your file, the problem will likely go away by using
{code:java}
document.getContext().getConfig().setMaxErrors(100){code}
This works because the first few pages will produce enough errors.

I also tried with VeraPDF, but that one aborted with an NPE. I'll retest with the current version.

> java.lang.OutOfMemoryError when validating pdf 
> -----------------------------------------------
>
>                 Key: PDFBOX-4450
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4450
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Preflight
>    Affects Versions: 2.0.13
>            Reporter: Dana Shaw
>            Priority: Major
>         Attachments: lean-from-the-trenches.pdf
>
>
> Getting an OutOfMemoryError when using Preflight to validate PDFs.
>  
> Env:
> Linux 64 bit (arch linux)
> Java 8
> java -version
>  java version "1.8.0_131"
>  Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
>  Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
> JVM args used to test: 
> java -Xmx2048m -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider
>  
> PDF that triggers the error:
> [^lean-from-the-trenches.pdf]
>  
> Console output:
> {code:java}
> Jan 30, 2019 10:25:58 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
> WARNING: Using fallback font ArialMT for base font Symbol
> Jan 30, 2019 10:25:58 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
> WARNING: Using fallback font ArialMT for base font ZapfDingbats
> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.Arrays.copyOfRange(Arrays.java:3664)
> at java.lang.String.<init>(String.java:207)
> at java.lang.StringBuilder.toString(StringBuilder.java:407)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1587)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1587)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1587)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1587)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1587)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
> at org.apache.pdfbox.cos.COSDictionary.toString(COSDictionary.java:1531)
> at org.apache.pdfbox.preflight.xobject.XObjFormValidator.checkGroup(XObjFormValidator.java:138)
> at org.apache.pdfbox.preflight.xobject.XObjFormValidator.validate(XObjFormValidator.java:73)
> at org.apache.pdfbox.preflight.process.reflect.GraphicObjectPageValidationProcess.validate(GraphicObjectPageValidationProcess.java:74)
> at org.apache.pdfbox.preflight.utils.ContextHelper.callValidation(ContextHelper.java:84)
> at org.apache.pdfbox.preflight.utils.ContextHelper.validateElement(ContextHelper.java:57)
> at org.apache.pdfbox.preflight.process.reflect.ResourcesValidationProcess.validateXObjects(ResourcesValidationProcess.java:224)
> at org.apache.pdfbox.preflight.process.reflect.ResourcesValidationProcess.validate(ResourcesValidationProcess.java:81)
> at org.apache.pdfbox.preflight.utils.ContextHelper.callValidation(ContextHelper.java:84){code}
>  
> Code used:
>  
> {code:java}
> import java.io.File;
> import java.util.ArrayList;
> import java.util.List;
> import org.apache.pdfbox.preflight.PreflightDocument;
> import org.apache.pdfbox.preflight.ValidationResult;
> import org.apache.pdfbox.preflight.ValidationResult.ValidationError;
> import org.apache.pdfbox.preflight.parser.PreflightParser;
> public class Validator {
>   private final File file;
>   private List<ValidationError> errorList = new ArrayList<>();
>   public Validator(File file) {
>     this.file = file;
>   }
>   public List<ValidationError> getErrors() {
>     return errorList;
>   }
>   // Returns true if validation found at least one error.
>   public boolean validate() throws Exception {
>     PreflightParser parser = new PreflightParser(file);
>     parser.parse();
>     // try-with-resources closes the document even if validation throws
>     try (PreflightDocument document = parser.getPreflightDocument()) {
>       document.validate();
>       ValidationResult result = document.getResult();
>       errorList = result.getErrorsList();
>     }
>     return !errorList.isEmpty();
>   }
> }
> {code}
>  
>  


