Posted to users@pdfbox.apache.org by Kim Hagedorn <ki...@andrena.de.INVALID> on 2024/03/07 13:08:47 UTC

OutOfMemoryException in FileSystemFontProvider (pdfbox v2.0.30)

Hello
 
 
I originally wanted to submit a defect to the PdfBox issue tracker but was redirected to this list, so here we go…
 
We experienced an OutOfMemoryError when calling
PDAcroForm.getDefaultResources().getFont(COSName); with COSName{Helv}
 
at this location:
 
main
  at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
  at java.util.Arrays.copyOf([BI)[B (Arrays.java:3537)
  at java.io.ByteArrayOutputStream.ensureCapacity(I)V (ByteArrayOutputStream.java:100)
  at java.io.ByteArrayOutputStream.write([BII)V (ByteArrayOutputStream.java:130)
  at org.apache.pdfbox.io.IOUtils.copy(Ljava/io/InputStream;Ljava/io/OutputStream;)J (IOUtils.java:70)
  at org.apache.pdfbox.io.IOUtils.toByteArray(Ljava/io/InputStream;)[B (IOUtils.java:52)
  at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.addTrueTypeFontImpl(Lorg/apache/fontbox/ttf/TrueTypeFont;Ljava/io/File;)V (FileSystemFontProvider.java:773)
  at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.access$1400(Lorg/apache/pdfbox/pdmodel/font/FileSystemFontProvider;Lorg/apache/fontbox/ttf/TrueTypeFont;Ljava/io/File;)V (FileSystemFontProvider.java:60)
  at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider$1.process(Lorg/apache/fontbox/ttf/TrueTypeFont;)V (FileSystemFontProvider.java:686)
  at org.apache.fontbox.ttf.TrueTypeCollection.processAllFonts(Lorg/apache/fontbox/ttf/TrueTypeCollection$TrueTypeFontProcessor;)V (TrueTypeCollection.java:106)
  at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.addTrueTypeCollection(Ljava/io/File;)V (FileSystemFontProvider.java:681)
  at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.scanFonts(Ljava/util/List;)V (FileSystemFontProvider.java:398)
  at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.<init>(Lorg/apache/pdfbox/pdmodel/font/FontCache;)V (FileSystemFontProvider.java:372)
  at org.apache.pdfbox.pdmodel.font.FontMapperImpl$DefaultFontProvider.<clinit>()V (FontMapperImpl.java:141)
  at org.apache.pdfbox.pdmodel.font.FontMapperImpl.getProvider()Lorg/apache/pdfbox/pdmodel/font/FontProvider; (FontMapperImpl.java:160)
  at org.apache.pdfbox.pdmodel.font.FontMapperImpl.findFont(Lorg/apache/pdfbox/pdmodel/font/FontFormat;Ljava/lang/String;)Lorg/apache/fontbox/FontBoxFont; (FontMapperImpl.java:430)
  at org.apache.pdfbox.pdmodel.font.FontMapperImpl.findFontBoxFont(Ljava/lang/String;)Lorg/apache/fontbox/FontBoxFont; (FontMapperImpl.java:393)
  at org.apache.pdfbox.pdmodel.font.FontMapperImpl.getFontBoxFont(Ljava/lang/String;Lorg/apache/pdfbox/pdmodel/font/PDFontDescriptor;)Lorg/apache/pdfbox/pdmodel/font/FontMapping; (FontMapperImpl.java:367)
  at de. <...> .getFontBoxFont(Ljava/lang/String;Lorg/apache/pdfbox/pdmodel/font/PDFontDescriptor;)Lorg/apache/pdfbox/pdmodel/font/FontMapping; (PdfFontManager.java:152)
  at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(Ljava/lang/String;)V (PDType1Font.java:146)
  at org.apache.pdfbox.pdmodel.font.PDType1Font.<clinit>()V (PDType1Font.java:79)
  at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(Lorg/apache/pdfbox/cos/COSDictionary;Lorg/apache/pdfbox/pdmodel/ResourceCache;)Lorg/apache/pdfbox/pdmodel/font/PDFont; (PDFontFactory.java:76)
  at org.apache.pdfbox.pdmodel.PDResources.getFont(Lorg/apache/pdfbox/cos/COSName;)Lorg/apache/pdfbox/pdmodel/font/PDFont; (PDResources.java:171)
  at de. <...> .initializeFonts()V (PdfFontHelper.java:66)
 
 
The cause appears to be that PDFBox initializes a FontCache when getFont is called, and this scans _all_ fonts on the system, including some very large system fonts (AppleColorEmoji is 189.9 MB). Each font is copied into a single large byte array at the location below, and that copy is where the OutOfMemoryError is thrown.
 
org.apache.pdfbox.pdmodel.font.FileSystemFontProvider#addTrueTypeFontImpl:773
> InputStream is = ttf.getOriginalData();
> byte[] ba = IOUtils.toByteArray(is);
> is.close();
> String hash = computeHash(ba);

 
I think this could be fixed easily by computing the hash with a DigestInputStream instead of first reading the whole font into a byte array. I have tested this locally and it seemed to work. I can send a patch file or submit a pull request if that helps.
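
To illustrate the idea, here is a minimal sketch of a streaming hash, not the actual patch. The class name StreamingHash, the 8 KB buffer size and the SHA-256 algorithm are assumptions made for this example; the algorithm that PDFBox's computeHash really uses may differ. The point is only that the digest is updated chunk by chunk, so memory use no longer depends on the size of the font.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of PDFBox.
final class StreamingHash {

    private StreamingHash() {
    }

    /**
     * Computes a hex-encoded hash of the stream without buffering it fully.
     * The stream is closed when the method returns.
     */
    static String computeHash(InputStream in) {
        MessageDigest md;
        try {
            // Assumption: SHA-256 is used here only as an example algorithm.
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }

        byte[] buffer = new byte[8192];
        try (DigestInputStream dis = new DigestInputStream(in, md)) {
            // Each read feeds the bytes just read into the digest.
            while (dis.read(buffer) != -1) {
                // nothing else to do; the buffer is reused for every chunk
            }
        } catch (IOException e) {
            // Wrapped to keep the sketch easy to call; real code might
            // prefer to propagate the IOException instead.
            throw new UncheckedIOException(e);
        }

        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

At the location quoted above, the call would then look roughly like `String hash = StreamingHash.computeHash(ttf.getOriginalData());`, with no intermediate byte array.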
 
 
Best regards
 
 
Kim Hagedorn
 
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: OutOfMemoryException in FileSystemFontProvider (pdfbox v2.0.30)

Posted by Tilman Hausherr <TH...@t-online.de>.
Hello Kim,
You're welcome. Please open a ticket and include your proposed solution.
I have approved your registration. (I initially denied it because your
text had no details whatsoever.)
Tilman


