Posted to users@pdfbox.apache.org by Sridhar So <sr...@tcs.com> on 2015/11/12 13:53:38 UTC

Speedup Font Cache: Performance Issue in PDFBox 2.0.0-RC1

Dear PDFBox Developers/Contributors,

I am unable to subscribe to the users mailing list because the subscribe link tries to open Outlook instead of a subscription page, hence this separate mail on a similar (or the same) issue that was already discussed.

Issue:
------- 
PDFBox 2.0.0-RC1 is very slow when printing (taking 35 to 50 seconds), as it tries to load the fonts each time, with the following messages:

WARNING: New fonts found, font cache will be re-built
Nov 12, 2015 3:17:26 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider <init>
WARNING: Building font cache, this may take a while
Nov 12, 2015 3:17:32 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider saveCache
WARNING: Finished building font cache, found 522 fonts
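
As a quick sanity check on the log above, the two timestamped header lines bracket the cache build itself. Below is a minimal Java sketch (standard library only; the class and method names are illustrative and not part of PDFBox) that computes the interval between two such java.util.logging header lines. It assumes the header has the "date time AM/PM, then class name" layout shown in the log.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class FontCacheLogTiming {
    // Timestamp layout of the header lines quoted in the log, e.g. "Nov 12, 2015 3:17:26 PM".
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("MMM d, yyyy h:mm:ss a", Locale.ENGLISH);

    /** Parses the timestamp that precedes the logger class name in a header line. */
    static LocalDateTime timestampOf(String headerLine) {
        // Assumption: the class name starts with " org." right after the timestamp.
        int end = headerLine.indexOf(" org.");
        if (end < 0) {
            throw new IllegalArgumentException("No class name found in: " + headerLine);
        }
        return LocalDateTime.parse(headerLine.substring(0, end), LOG_TIME);
    }

    /** Returns the time elapsed between two header lines. */
    static Duration elapsed(String startHeader, String endHeader) {
        return Duration.between(timestampOf(startHeader), timestampOf(endHeader));
    }

    public static void main(String[] args) {
        String start = "Nov 12, 2015 3:17:26 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider <init>";
        String end = "Nov 12, 2015 3:17:32 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider saveCache";
        System.out.println("Font cache build took " + elapsed(start, end).getSeconds() + " s");
    }
}
```

For the excerpt above this prints 6 s, so a single rebuild as logged accounts for about 6 of the reported 35 to 50 seconds.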


Is a fix or patch available to avoid the slow performance caused by the above (rebuilding the font cache every time)?
If no fix is available for 2.0.0-RC1, is there any way to fix the alignment issue in PDFBox 1.8.10? With 1.8.10 the left margin is too small and the first few characters are cut off in the printout.

With PDFBox 1.8.10 there is no performance issue, but the alignment in the printout is not correct.
With PDFBox 2.0.0-RC1 we are facing the performance issue.

The PDF document used has Arial Unicode or TrueType fonts.

A similar discussion thread is pasted below. I was unable to reply to that thread or to subscribe to the users mailing list, hence this separate mail.

Regards
Sridhar

Subject:	Re: Speedup Font Cache	
From:	John Hewson (jo...@jahewson.com)
Date:	Oct 21, 2015 5:26:41 pm
List:	org.apache.pdfbox.users

On 21 Oct 2015, at 09:43, Maruan Sahyoun <sa...@fileaffairs.de> wrote:

Hi,

On 21.10.2015 at 18:40, Tilman Hausherr <TH...@t-online.de> wrote:

On 21.10.2015 at 14:10, Roberto Nibali wrote:
Hi John

On Wed, Oct 21, 2015 at 12:35 AM, John Hewson <jo...@jahewson.com> wrote:

Yes, I'm able to replicate that issue on Windows. It's apparently related
to administrator ownership of that registry key's parent node. Looks like
it'll be necessary to log in as admin and create that key with user access.
I guess that's far from ideal?

The whole issue also happens on Mac OS X. When you introduced this on-disk
cache a couple of months back, it worked fine; however, one of the recent
changes in SVN must have broken the originally intended functionality. Not
only does the font cache setup take 5-10 times as long as it used to, the
cache also no longer seems to be persisted. Version used:

$ svn info | grep -i changed
Last Changed Author: tilman
Last Changed Rev: 1709647
Last Changed Date: 2015-10-20 19:04:02 +0200 (Tue, 20 Oct 2015)

Running my test tool indicates:

Oct 21, 2015 2:08:29 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider loadCache
WARNING: New fonts found, font cache will be re-built
Oct 21, 2015 2:08:29 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider <init>
WARNING: Building font cache, this may take a while
Oct 21, 2015 2:08:39 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider saveCache
WARNING: Finished building font cache, found 654 fonts
[INFO, ctx=./ccalt.pdf]: Opening Source ./ccalt.pdf
[INFO, ctx=./ccalt.pdf]: Opening Template ./cctemp.pdf
[INFO, ctx=./ccalt.pdf]: Writing Output ./ccmig.pdf
[INFO, ctx=./ccalt.pdf]: Completed in 15037.02ms

This used to be anything between 1200ms and 2300ms and once it was
persisted onto disk, it was rather fast in subsequent calls. Unfortunately,
SVN does not provide the handy tool of "git bisect" to quickly find out
which change actually caused this regression.

There were only 4 changes since then, so it might be worth a try to just revert
that file.

(I can't help; for me, it has always been slow.)

Could it be that 1) you installed new stuff on your computer, 2) that MacOS has
many of its fonts in .ttc files? In Windows there are only 10.

on my OS X I have 92 ttc files (out of 384) :-)

Yep, OS X uses ttc much more heavily than Windows and some of those are big
Asian fonts which PDFBox parses relatively slowly.

-- John
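
To compare machines the way the exchange above does (92 .ttc files out of 384 on one OS X system, only 10 on Windows), the font files in a directory can be counted by extension. Below is a minimal Java sketch using only the standard library. The class name, method name, and the default directory are assumptions for illustration; it does not reproduce how PDFBox itself scans for fonts.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

final class FontFileCensus {
    /** Counts regular files under root by lower-cased extension, e.g. "ttf", "ttc", "otf". */
    static Map<String, Long> countByExtension(Path root) throws IOException {
        Map<String, Long> counts = new TreeMap<>();
        try (Stream<Path> files = Files.walk(root)) {
            files.filter(Files::isRegularFile).forEach(p -> {
                String name = p.getFileName().toString();
                int dot = name.lastIndexOf('.');
                // Skip names without an extension and dotfiles such as ".DS_Store".
                if (dot > 0) {
                    counts.merge(name.substring(dot + 1).toLowerCase(Locale.ROOT), 1L, Long::sum);
                }
            });
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default for illustration; pass the font directory to inspect as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : "/Library/Fonts");
        if (!Files.isDirectory(root)) {
            System.err.println("Not a directory: " + root);
            return;
        }
        Map<String, Long> counts = countByExtension(root);
        long total = counts.values().stream().mapToLong(Long::longValue).sum();
        System.out.println(counts.getOrDefault("ttc", 0L) + " .ttc files out of " + total);
        System.out.println(counts);
    }
}
```

A high share of .ttc files would be consistent with the slower cache build described above, although the sketch only counts files and says nothing about how long each one takes to parse.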

BR
Maruan

Tilman

Let me know if you need any further input.

Cheers
Roberto



Regards
Sridhar Sowmiyanarayanan
Tata Consultancy Services
Website: http://www.tcs.com


Re: Speedup Font Cache: Performance Issue in PDFBox 2.0.0-RC1

Posted by Tilman Hausherr <TH...@t-online.de>.
You can get the snapshot either through maven, or here:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.0-SNAPSHOT/

Btw, there is no need for everybody on the team to post the same question.

Tilman





Re: Speedup Font Cache: Performance Issue in PDFBox 2.0.0-RC1

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi,

> On 12.11.2015 at 13:53, Sridhar So <sr...@tcs.com> wrote:
> 
> Dear PDFBox Developers/Contributors,
> 
> I am unable to subscribe to the users mailing list because the subscribe link tries to open Outlook instead of a subscription page, hence this separate mail on a similar (or the same) issue that was already discussed.
> 
> Issue:
> ------- 
> PDFBox 2.0.0-RC1 is very slow when printing (taking 35 to 50 seconds), as it tries to load the fonts each time, with the following messages:
> 
> WARNING: New fonts found, font cache will be re-built
> Nov 12, 2015 3:17:26 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider <init>
> WARNING: Building font cache, this may take a while
> Nov 12, 2015 3:17:32 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider saveCache
> WARNING: Finished building font cache, found 522 fonts

That has been fixed after RC1, so please try the latest snapshot build: http://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/

A 2nd Release Candidate should also be available soon.

BR
Maruan
