Posted to dev@pdfbox.apache.org by "Alfred (Jira)" <ji...@apache.org> on 2020/06/11 06:41:00 UTC

[jira] [Created] (PDFBOX-4877) Matrix class performance improvements

Alfred created PDFBOX-4877:
------------------------------

             Summary: Matrix class performance improvements
                 Key: PDFBOX-4877
                 URL: https://issues.apache.org/jira/browse/PDFBOX-4877
             Project: PDFBox
          Issue Type: Improvement
          Components: Parsing, Text extraction
    Affects Versions: 2.0.20, 3.0.0 PDFBox
            Reporter: Alfred


I am testing text extraction from PDF files and profiling the execution.

I found that the second-biggest time consumer is the static initialization code in Standard14Fonts, which loads fonts from the PDFBox jar.

Looking at the code, I realized that we don't have to load all of the fonts statically when the class loads.

Not all PDFs need all of the fonts, so loading each font lazily, only when it is first needed, would save both time and memory.

The memory savings would matter most on a tablet or a phone, where the entire memory space of the app is 80 MB to 160 MB.
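
To illustrate the lazy-loading idea, here is a minimal sketch of a cache that loads an entry only on first request. It is not the actual Standard14Fonts code: the class name LazyFontCache, its methods, and the loader function are all hypothetical, and the loader stands in for whatever reads a font from the jar.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Hypothetical lazy cache, for illustration only.
 * Nothing is loaded when the class is initialized; each entry is
 * loaded the first time it is requested and then reused.
 */
final class LazyFontCache<T> {
    // Entries that have already been loaded, keyed by font name.
    private final Map<String, T> loaded = new ConcurrentHashMap<>();
    // Assumed loader: stands in for reading a font from the jar.
    // It must not return null, or the result will not be cached.
    private final Function<String, T> loader;
    // Counts real loads, so the saving can be observed.
    private final AtomicInteger loadCount = new AtomicInteger();

    LazyFontCache(Function<String, T> loader) {
        this.loader = loader;
    }

    /** Returns the entry for fontName, loading it on first use only. */
    T get(String fontName) {
        return loaded.computeIfAbsent(fontName, name -> {
            loadCount.incrementAndGet();
            return loader.apply(name);
        });
    }

    /** Number of entries actually loaded so far. */
    int loadCount() {
        return loadCount.get();
    }
}
```

With a cache like this, a document that only uses Helvetica would trigger one load instead of fourteen, and ConcurrentHashMap.computeIfAbsent keeps the load to at most one per font name even with concurrent callers.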
