Posted to dev@pdfbox.apache.org by Cornelis Hoeflake <c....@postex.com> on 2015/08/24 11:20:10 UTC

Font provider since PDFBOX-2842

Hi,

Before PDFBOX-2842 we set the FontProvider on
ExternalFonts to a thread-bound font provider (one that uses a ThreadLocal).
We did this because we have a system with multiple customers who each have
their own fonts. Those fonts can also be added to the system dynamically at
runtime. We implemented the FontProvider so that it looks up each requested
font in the database.
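
A minimal sketch of such a thread-bound arrangement, for illustration only. It assumes a simplified lookup interface: FontSource, ThreadBoundFontSource, MapFontSource and withSource are hypothetical names, not PDFBox types, and the in-memory map stands in for the database lookup.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical stand-in for a font lookup interface; not the PDFBox FontProvider type. */
interface FontSource {
    Optional<byte[]> findFont(String fontName);
}

/** Registered once globally; forwards each request to the source bound to the current thread. */
final class ThreadBoundFontSource implements FontSource {
    private static final ThreadLocal<FontSource> CURRENT = new ThreadLocal<>();

    /** Binds a customer's source for the duration of the task and always restores the previous state. */
    static <T> T withSource(FontSource source, Supplier<T> task) {
        FontSource previous = CURRENT.get();
        CURRENT.set(source);
        try {
            return task.get();
        } finally {
            if (previous == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(previous);
            }
        }
    }

    @Override
    public Optional<byte[]> findFont(String fontName) {
        FontSource delegate = CURRENT.get();
        if (delegate == null) {
            throw new IllegalStateException("No customer font source bound to this thread");
        }
        return delegate.findFont(fontName);
    }
}

/** In-memory stand-in for the per-customer database lookup. */
final class MapFontSource implements FontSource {
    private final Map<String, byte[]> fonts;

    MapFontSource(Map<String, byte[]> fonts) {
        this.fonts = fonts;
    }

    @Override
    public Optional<byte[]> findFont(String fontName) {
        return Optional.ofNullable(fonts.get(fontName));
    }
}
```

Two customers can then use the same font name without clashing, because each rendering thread only sees the source bound to it.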

In the new situation the FontMapper reads the font information once (at
setProvider) and uses it globally for the whole system.

How can we recreate the situation we had with the new code? I do
not see an option.
I think it would be a good idea to drop the static FontMapper, FontProvider, etc.
and replace them with a FontProvider/Mapper supplied at the start of a document.

Kind regards,
Cornelis Hoeflake

RE: Font provider since PDFBOX-2842

Posted by Simon Steiner <si...@gmail.com>.
Hi,

There is
https://issues.apache.org/jira/browse/PDFBOX-2539

Thanks

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


Re: Font provider since PDFBOX-2842

Posted by Cornelis Hoeflake <c....@postex.com>.
Hi John,

For some reason my email program hides your replies, so I missed your last
message. This week we were required to upgrade to the latest 2.0 code, but
we could not upgrade because of the original issue I described.

So I was very lucky to see your last reply when I was searching for this
thread. We have now implemented our own FontMapper with a ThreadLocal and
it works like a charm!

The only thing we still have to do is implement the getCIDFont method. That
method is very complex in FontMapperImpl. What about moving that logic
into a separate class that could be reused? I mean especially the font
matching part of the method.

Many thanks!

Kind regards,
Cornelis Hoeflake



Re: Font provider since PDFBOX-2842

Posted by John Hewson <jo...@jahewson.com>.
Hi,

I have a new solution for this problem. I’m going to make FontMapper into an interface and
have a singleton instance available via FontMappers.instance(). You’ll be able to provide
your own implementation via FontMappers.set(FontMapper) and the current FontMapper
code will be moved to FontMapperImpl, which will be provided as a static default when
FontMappers.instance() is called for the first time without a prior call to FontMappers.set.

The FontMapper interface simply contains the following three methods:

FontMapping<TrueTypeFont> getTrueTypeFont(String baseFont, PDFontDescriptor fontDescriptor);
FontMapping<FontBoxFont> getFontBoxFont(String baseFont, PDFontDescriptor fontDescriptor);
CIDFontMapping getCIDFont(String baseFont, PDFontDescriptor fontDescriptor, PDCIDSystemInfo cidSystemInfo);

You can put whatever implementation you like in there.

See PDFBOX-2997 for more details.

— John

P.S. I’ve looked into having a per-document FontMapper or similar but in PDFBox fonts are not
tied to a specific document, so it just won’t work. ThreadLocal lets you get round that though,
given that PDDocument is single-threaded anyway. So you should be all set.
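
The "default unless set" behaviour described above can be sketched as a small holder class. This is an illustration under stated assumptions, not the PDFBox source: NameMapper, DefaultNameMapper and NameMappers are hypothetical, simplified stand-ins for FontMapper, FontMapperImpl and FontMappers, and the single map method replaces the three real interface methods.

```java
import java.util.Objects;

/** Hypothetical, simplified mapper interface (the real one has the three methods listed above). */
interface NameMapper {
    String map(String baseFont);
}

/** Stand-in for the built-in implementation. */
final class DefaultNameMapper implements NameMapper {
    @Override
    public String map(String baseFont) {
        return "default:" + baseFont;
    }
}

/** Holder: a caller-supplied mapper wins, otherwise the default is created on first use. */
final class NameMappers {
    private static NameMapper instance;

    private NameMappers() {
    }

    /** Installs a custom mapper; later calls to instance() return it. */
    static synchronized void set(NameMapper mapper) {
        instance = Objects.requireNonNull(mapper, "mapper");
    }

    /** Returns the installed mapper, creating the default only if set() was never called. */
    static synchronized NameMapper instance() {
        if (instance == null) {
            instance = new DefaultNameMapper();
        }
        return instance;
    }
}
```

A mapper installed this way can itself delegate to a ThreadLocal, which is how a per-customer setup fits behind a single global instance.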



Re: Font provider since PDFBOX-2842

Posted by Cornelis Hoeflake <c....@postex.com>.
> Hi,
>
>> Sorry, please can you answer all of the questions in my previous mail or I
>> can’t help you.
>
> Not sure whether I need any help right now. All I wanted to do is to vote
> for the "per-document FontMapper or FontProvider" solution and explain some
> reasons for that.

I'm still searching for a solution, so any help is welcome!

>> Why don’t your customers embed such fonts?
>
> Because they have the PDFs in an archive created 20 years ago or they get
> the PDF from a funny marketing department which they are not able to teach
> anything or ... For all kinds of reasons.

Same here. Another case is that customers now send their PDFs to a
printing house. Some printing houses don't like embedded fonts because they
take more memory on their printers.

>> Are you talking about providing “desired” mappings, e.g. allowing
>> “Helvetica” to be mapped to the customer’s own “Helvetica.ttf” or are you
>> talking about support for custom fonts, e.g. “MyCorporateFont.ttf”?
>
> Both.

Both. Really? Yes.

>> Are you wanting to customise the substitution behaviour, or just provide
>> additional font files?
>
> Both, although in most use cases the need is just to provide
> “MyCorporateFont.ttf” for the font "MyCorporateFont" that is referenced
> from PDFs produced by some reporting tool or whatever fancy source of
> external PDFs. In our configuration, one can set up font name aliases, which
> could be used to avoid the heuristics done in FontMapper.findFont. We use
> it that way in our PDFBox 1.7 based solution.

Both.

>> Initialising FileSystemFontProvider can take 10 seconds or more, so it’s
>> not practical to create a new one for each document.
>
> Just to avoid misunderstanding of my previous comment on this. I do care
> about the speed of a document rendering of course. But
> FileSystemFontProvider is of no use for me. I have to initialize the font
> provider myself, from our configuration and from fonts in our database. I
> do not want to do that per document, but I have to do that again each time
> when the configuration changes. No matter whether frequently or just from
> time to time, but for sure not just once per lifetime of a JVM.

In my case, I do not want to initialize a FontProvider that takes 10
seconds and runs all the matching algorithms you named. I want to
substitute the FontProvider/Mapper with a custom one where I can do that
myself. We are not looking for a best match, but for the one perfect
match. And if there is no perfect match, the system has to throw an error.
We render thousands of PDFs for hundreds of customers in a short time span,
so memory-wise we will not be happy if we have to instantiate hundreds of
FontProviders/Mappers that all have fonts loaded and keep them in memory
because the startup costs are too high.
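
The "exact match or error" behaviour can be sketched as follows. StrictFontRegistry and FontNotFoundException are hypothetical names used only for this illustration; nothing here is PDFBox API, and the file names are placeholders.

```java
import java.util.HashMap;
import java.util.Map;

/** Thrown when a document references a font that the customer has not registered. */
final class FontNotFoundException extends RuntimeException {
    FontNotFoundException(String customer, String fontName) {
        super("No exact match for font '" + fontName + "' of customer '" + customer + "'");
    }
}

/** Hypothetical per-customer registry: exact name match or failure, no best-fit fallback. */
final class StrictFontRegistry {
    private final String customer;
    private final Map<String, String> fontFilesByName = new HashMap<>();

    StrictFontRegistry(String customer) {
        this.customer = customer;
    }

    /** Registers the font file that must be used for the given font name. */
    void register(String fontName, String fontFile) {
        fontFilesByName.put(fontName, fontFile);
    }

    /** Returns the registered file, or throws instead of guessing a substitute. */
    String resolve(String fontName) {
        String file = fontFilesByName.get(fontName);
        if (file == null) {
            throw new FontNotFoundException(customer, fontName);
        }
        return file;
    }
}
```

Failing fast like this makes a missing font visible at render time instead of silently producing output in a similar-looking substitute.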

A per-document solution is in my opinion the holy grail. But as John said,
that would break a lot of APIs; I agree. A solution like the one we had on 2.0
some weeks ago (which I implemented with a ThreadLocal), without the
best-fit font matching algorithms, would also work for me.


Kind regards,
Cornelis

Re: Font provider since PDFBOX-2842

Posted by Leonard Rosenthol <lr...@adobe.com>.
Just a reminder from your favorite standards person…

While it is permissible (well, not mentioned as such) to substitute a local font for an embedded font in ISO 32000 (aka regular PDF), it is FORBIDDEN in all of the subset standards.  So if you are going to be rendering PDF/A, PDF/X, etc. - you MUST use the embedded font program.

Leonard






Re: Font provider since PDFBOX-2842

Posted by Petr Slabý <sl...@kadel.cz>.
Hi,

> Sorry, please can you answer all of the questions in my previous mail or I 
> can’t help you.

Not sure whether I need any help right now. All I wanted to do is to vote 
for the "per-document FontMapper or FontProvider" solution and explain some 
reasons for that.

Now searching for question marks in your previous mail:

> So you have fonts for specific customers for rendering only their 
> documents and you expect those to change frequently?

Not frequently, but yes, they can change. The frequency does not matter then 
as the change has to be immediate while the application server cluster is 
restarted only once or twice a year.

> Really?

Yes.

> Why don’t your customers embed such fonts?

Because they have the PDFs in an archive created 20 years ago or they get 
the PDF from a funny marketing department which they are not able to teach 
anything or ... For all kinds of reasons.

> Are you talking about providing “desired” mappings, e.g. allowing 
> “Helvetica” to be mapped to the customer’s own “Helvetica.ttf” or are you 
> talking about support for custom fonts, e.g. “MyCorporateFont.ttf”?

Both.

> Are you wanting to customise the substitution behaviour, or just provide 
> additional font files?

Both, although in most use cases the need is just to provide 
“MyCorporateFont.ttf” for the font "MyCorporateFont" that is referenced from 
PDFs produced by some reporting tool or whatever fancy source of external 
PDFs. In our configuration, one can set up font name aliases, which could be 
used to avoid the heuristics done in FontMapper.findFont. We use it that way 
in our PDFBox 1.7 based solution.

>  Initialising FileSystemFontProvider can take 10 seconds or more, so it’s 
> not practical to create a new one for each document.

Just to avoid any misunderstanding of my previous comment on this: I do care 
about the speed of document rendering, of course. But 
FileSystemFontProvider is of no use to me. I have to initialize the font 
provider myself, from our configuration and from fonts in our database. I do 
not want to do that per document, but I have to do it again each time 
the configuration changes, whether that is frequently or just from time to 
time, but for sure not just once per lifetime of a JVM.
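
One way to picture "re-initialize on configuration change, without restarting the JVM" is an index that is swapped atomically. The sketch below is only an illustration: ReloadableFontIndex is a hypothetical class, the font-name-to-file map stands in for whatever the real provider would hold, and Map.copyOf requires Java 10 or later.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical font index that can be replaced while renderings are in flight. */
final class ReloadableFontIndex {
    private final AtomicReference<Map<String, String>> snapshot;

    ReloadableFontIndex(Map<String, String> initial) {
        // Keep an immutable copy so later changes to the caller's map have no effect.
        snapshot = new AtomicReference<>(Map.copyOf(initial));
    }

    /** Readers see either the old or the new index, never a half-built one. */
    String fileFor(String fontName) {
        return snapshot.get().get(fontName);
    }

    /** Called when the configuration changes; builds the new index first, then swaps it in. */
    void reload(Map<String, String> fresh) {
        snapshot.set(Map.copyOf(fresh));
    }
}
```

Because the index is built once per configuration change rather than once per document, the initialization cost is not paid on every rendering.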

Best regards,
Petr.




Re: Font provider since PDFBOX-2842

Posted by John Hewson <jo...@jahewson.com>.
Sorry, please can you answer all of the questions in my previous mail or I can’t help you.

— John

> On 2 Sep 2015, at 03:03, Petr Slabý <sl...@kadel.cz> wrote:
> 
>> As with Cornelis, I’d direct you to look at jimfs, which is an in-memory filesystem for Java
> 
> That is not needed, I think. All I need to do is to implement FontProvider on top of our internal resource provider API which reads the stuff from the database. But I need multiple instances of FontMapper/FontProvider in one JVM (because more than one application/customer  can be running in the application server instance) and I need to be able to flush the FontMapper if the configuration changes and the FontProvider has new files.
> 
>> So you have fonts for specific customers for rendering only their documents and you expect those to change frequently?
>> Really? Why don’t your customers embed such fonts?  ...
> 
> Customers do all kinds of wild things. Most usually, the PDFs we are dealing with are supposed to be PDF/A, so there is no problem. But I have already seen PDFs in a company's archive which were using external fonts. Or customers with an old application nobody is able to touch, producing such PDFs. I cannot send the customer to hell then.
> 
> The fonts do not change frequently, of course. But as the Universe wants it, a missing font or wrong configuration is never found on a test system. It happens somewhere in the production and then it is really impossible to restart the JVM.
> 
> In any case, external fonts are possible and in a perfect solution I need to cover the two previously mentioned use cases - multiple FontMapper instances in a single JVM and a change of fonts available to a FontMapper after it has already been initialized.
> 
> As a minimal solution, it should be enough for me if I can re-initialize the FontMapper using a new instance of FontProvider. It will be tricky to make it thread safe, but I hope I will manage to do that using a "readers–writer lock" pattern. This does not cover the multiple different configurations case very well (or at all because only a single configuration can be used for concurrent renderings), but at least enables to change the configuration at runtime. Maybe this should already work if I use FontMapper.setProvider()?
> 
> Unfortunately, PDFBox 2.0 release escaped our own release cycle by one year once again, so I will have no resources to really try to implement against the new version until somewhere in the next year. If you say having a non-static FontMapper would break all the APIs and postpone the 2.0 release by yet another year, I will rather live with what we have now.
> 
> Best regards,
> Petr.
> 
> -----Original message----- From: John Hewson
> Sent: Wednesday, September 02, 2015 4:43 AM
> To: dev@pdfbox.apache.org
> Subject: Re: Font provider since PDFBOX-2842
> 
> 
>> On 1 Sep 2015, at 06:12, Petr Slabý <sl...@kadel.cz> wrote:
>> 
>> John,
>> I am facing basically the same issues as Cornelis.
>> 
>> Our application is running in a J2EE application server, so installing anything into the operating system is usually not an option. It is already difficult in a cluster environment managed by the customer, but becomes virtually impossible in a dynamic cloud. Everything has to come from a database.
> 
> As with Cornelis, I’d direct you to look at jimfs, https://github.com/google/jimfs, which is an in-memory filesystem for Java. You could pull fonts from your database into there. Maybe one directory for each customer.
> 
>> Some of our customers are (document) service providers working for a bunch of other companies. There it is important to be able to use different configurations for different groups of documents, because of licencing and other reasons.
> 
> So you have fonts for specific customers for rendering only their documents and you expect those to change frequently? Really? Why don’t your customers embed such fonts?
> 
> Are you talking about providing “desired” mappings, e.g. allowing “Helvetica” to be mapped to the customer’s own “Helvetica.ttf” or are you talking about support for custom fonts, e.g. “MyCorporateFont.ttf”?
> 
> Are you wanting to customise the substitution behaviour, or just provide additional font files?
> 
>> Last but not least, the configuration (available fonts) can be changed by storing some new resources into the database. Restarting the application server (cluster) - so that a static class gets instantiated again and reads the new configuration - is no option.
> 
> Yes, obviously you don’t want to have to restart. So we do need some way in PDFBox to allow the FontMapper to be re-initialized at runtime. Note that PDFBox no longer requests one font at a time, it requests all fonts because we need to examine all fonts to find the best match. You’re also going to need to store all of the font metadata in your database, so that your custom FontProvider can pull all of it when it’s initialised. That means all of the fields in the FontInfo class need to be present in your database.
> 
>>> Initialising FileSystemFontProvider can take 10 seconds or more, so it’s
>>> not practical to create a new one for each document.
>> I do not care :-) I am not reading file system (usually not allowed to anyway).  And I know the place where to initialize and cache the fonts in my application.
> 
> Everyone else does care if it takes 10sec to open every PDDocument but...
> 
>>> We could switch to having a per-document FontMapper or FontProvider, with the default being to use a shared static provider
>> Would be perfect, I think.
> 
> … this mitigates that issue, so it should be fine.
> 
> — John
> 
>> Best regards,
>> Petr
>> 
>> -----Original message----- From: Cornelis Hoeflake
>> Sent: Tuesday, September 01, 2015 2:26 PM
>> To: dev@pdfbox.apache.org
>> Subject: Re: Font provider since PDFBOX-2842
>> 
>> Sorry for my delayed reply, I missed your reply for some reason...
>> 
>> 2015-08-24 20:54 GMT+02:00 John Hewson <jo...@jahewson.com>:
>> 
>>> Hi Cornelis,
>>> 
>>> > On 24 Aug 2015, at 02:20, Cornelis Hoeflake <c....@postex.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > Before PDFBOX-2842 we set the FontProvider on
>>> > ExternalFonts to a thread-bound font provider (one that uses a ThreadLocal).
>>> > We did this because we have a system with multiple customers who each have
>>> > their own fonts. Those fonts can also be added to the system dynamically at
>>> > runtime. We implemented the FontProvider so that it looks up each requested
>>> > font in the database.
>>> 
>>> My first question is: do you really need this? What fonts are your users
>>> uploading and
>>> why are they missing from PDFs? Could you make them available in some
>>> other way?
>>> Do the fonts really need to be locked down to a given user? Why not keep
>>> it simple and
>>> copy the font files to the local system?
>>> 
>> 
>> My customers have licensed (and custom made) fonts. So the first issue is
>> that a license for customer 1 is not valid for customer 2. The next problem
>> is that the names of custom made fonts do not have to be globally unique.
>> The last problem is that the server environment is provisioned on services
>> like Amazon Elastic Beanstalk with autoscaling etc. In that case there is no
>> decent option to copy fonts to the system.
>> 
>> 
>>> > In the new situation FontMapper reads the font information once (at
>>> > setProvider) and uses this global for the whole system.
>>> 
>>> The old approach, ExternalFonts used the font name to perform a direct
>>> lookup,
>>> delegating this to the FontProvider. The new FontMapper performs a best-fit
>>> lookup using multiple attributes such as the name, ROS, weight, family,
>>> unicode
>>> ranges, style, and panose classification. This requires that we first
>>> build an index
>>> of those attributes for each font.
>>> 
>> 
>> Ok, that is a nice feature. But for my case customers want the correct
>> font, not a best-fit.
>> 
>> 
>>> > How can we create a situation like we had but then with the new code? I do
>>> > not see an option.
>>> 
>>> Could you explain how you’re currently using ThreadLocal? There might be a
>>> workaround. Failing that we could provide a mechanism to allow the font
>>> index
>>> to be updated dynamically.
>>> 
>> 
>> I have set a custom made 'general' FontProvider. Before doing any operation
>> which uses ExternalFonts.getProvider(), I set a customer FontProvider
>> (which knows all about the fonts of that customer) in a ThreadLocal in the
>> 'general' FontProvider. The 'general' FontProvider delegates each request
>> to the customer FontProvider.
>> 
>> 
>>> > I think it is a good idea to drop static FontMapper, FontProvider etc. And
>>> > replace it with a given FontProvider/Mapper at start of a document.
>>> 
>>> Initialising FileSystemFontProvider can take 10 seconds or more, so it’s
>>> not practical
>>> to create a new one for each document. We could switch to having a
>>> per-document
>>> FontMapper or FontProvider, with the default being to use a shared static
>>> provider,
>>> with the user being able to set their own, however we have static APIs
>>> which require
>>> a FontMapper to exist independently from a document, namely:
>>> 
>>>       - all PDFont subclass constructors
>>>       - PDType1Font constants (TIMES_ROMAN, HELVETICA, COURIER, etc.)
>>> 
>>> I don’t see any way to make the changes you’re after without breaking all
>>> those APIs.
>>> 
>> 
>> Yes, that is true... On the other hand, more and more server software
>> products are switching to services like Amazon Elastic Beanstalk etc. And
>> last but not least, the API for FontProvider/Mapper and the previous
>> ExternalFonts is already broken, both within 2.0 and between 1.8 and 2.0.
>> 
>> 
>>> — John
>>> 
>>> > Kind regards,
>>> > Cornelis Hoeflake
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
>>> For additional commands, e-mail: dev-help@pdfbox.apache.org
>>> 

Re: Font provider since PDFBOX-2842

Posted by Petr Slabý <sl...@kadel.cz>.
> As with Cornelis, I’d direct you to look at jimfs, which is an in-memory 
> filesystem for Java

That is not needed, I think. All I need to do is to implement FontProvider 
on top of our internal resource provider API which reads the stuff from the 
database. But I need multiple instances of FontMapper/FontProvider in one 
JVM (because more than one application/customer can be running in the 
application server instance) and I need to be able to flush the FontMapper 
if the configuration changes and the FontProvider has new files.
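
As a rough illustration of the two requirements above (several independent configurations in one JVM, plus a flush when the fonts change), a per-customer registry could look like the sketch below. TenantFontRegistry and its loader are hypothetical names, not PDFBox API; T stands in for whatever mapper or provider object would be cached per customer.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical per-customer cache of font configurations (not PDFBox API).
final class TenantFontRegistry<T> {
    private final Map<String, T> byTenant = new ConcurrentHashMap<>();
    private final Function<String, T> loader;

    // The loader builds a configuration for one customer, e.g. from a database.
    TenantFontRegistry(Function<String, T> loader) {
        this.loader = loader;
    }

    // Returns the cached configuration, loading it on first use.
    T forTenant(String tenantId) {
        return byTenant.computeIfAbsent(tenantId, loader);
    }

    // Drops one customer's configuration so the next request reloads it.
    void flush(String tenantId) {
        byTenant.remove(tenantId);
    }
}

final class TenantRegistryDemo {
    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        TenantFontRegistry<String> registry =
                new TenantFontRegistry<>(id -> id + "-fonts-v" + loads.incrementAndGet());
        System.out.println(registry.forTenant("customerA")); // customerA-fonts-v1
        System.out.println(registry.forTenant("customerA")); // cached: customerA-fonts-v1
        registry.flush("customerA");
        System.out.println(registry.forTenant("customerA")); // reloaded: customerA-fonts-v2
    }
}
```

Renderings that already hold a reference keep using the old configuration until they finish; only later requests see the reloaded one.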

> So you have fonts for specific customers for rendering only their 
> documents and you expect those to change frequently?
> Really? Why don’t your customers embed such fonts?  ...

Customers do all kinds of wild things. Usually, the PDFs we are dealing 
with are supposed to be PDF/A, so there is no problem. But I have already 
seen PDFs in a company's archive which were using external fonts. Or 
customers with an old application nobody is able to touch, producing such 
PDFs. I cannot send the customer to hell then.

The fonts do not change frequently, of course. But as the universe would have it, 
a missing font or wrong configuration is never found on a test system. It 
happens somewhere in production, and then it is really impossible to 
restart the JVM.

In any case, external fonts are possible and in a perfect solution I need to 
cover the two previously mentioned use cases - multiple FontMapper instances 
in a single JVM and a change of fonts available to a FontMapper after it has 
already been initialized.

As a minimal solution, it should be enough for me if I can re-initialize the 
FontMapper using a new instance of FontProvider. It will be tricky to make 
it thread safe, but I hope I will manage to do that using a "readers–writer 
lock" pattern. This does not cover the multiple different configurations 
case very well (or at all, because only a single configuration can be used 
for concurrent renderings), but it at least makes it possible to change the 
configuration at runtime. Maybe this should already work if I use 
FontMapper.setProvider()?
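
The readers-writer idea could be sketched roughly as follows. SwappableFontConfig is a hypothetical wrapper, not PDFBox API; T stands in for the mapper or provider, and whether FontMapper.setProvider() can safely be called from reload is exactly the open question above.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// Hypothetical holder for a font configuration that can be swapped at runtime.
final class SwappableFontConfig<T> {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private T current;

    SwappableFontConfig(T initial) {
        this.current = initial;
    }

    // Any number of renderings may hold the read lock at the same time.
    <R> R render(Function<T, R> rendering) {
        lock.readLock().lock();
        try {
            return rendering.apply(current);
        } finally {
            lock.readLock().unlock();
        }
    }

    // A reload waits for in-flight renderings to finish, then swaps the configuration.
    void reload(T replacement) {
        lock.writeLock().lock();
        try {
            current = replacement;
        } finally {
            lock.writeLock().unlock();
        }
    }
}

final class SwappableConfigDemo {
    public static void main(String[] args) {
        SwappableFontConfig<String> config = new SwappableFontConfig<>("fonts-v1");
        System.out.println(config.render(c -> "rendered with " + c)); // rendered with fonts-v1
        config.reload("fonts-v2");
        System.out.println(config.render(c -> "rendered with " + c)); // rendered with fonts-v2
    }
}
```

As noted, this still allows only one configuration at a time for all concurrent renderings.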

Unfortunately, the PDFBox 2.0 release missed our own release cycle by a year 
once again, so I will have no resources to really try to implement against 
the new version until sometime next year. If you say having a 
non-static FontMapper would break all the APIs and postpone the 2.0 release 
by yet another year, I would rather live with what we have now.

Best regards,
Petr.


Re: Font provider since PDFBOX-2842

Posted by John Hewson <jo...@jahewson.com>.
> On 1 Sep 2015, at 06:12, Petr Slabý <sl...@kadel.cz> wrote:
> 
> John,
> I am facing basically the same issues as Cornelis.
> 
> Our application is running in a J2EE application server, so installing anything into the operating system is usually not an option. It is already difficult in a cluster environment managed by the customer, but becomes virtually impossible in a dynamic cloud. Everything has to come from a database.

As with Cornelis, I’d direct you to look at jimfs, https://github.com/google/jimfs, which is an in-memory filesystem for Java. You could pull fonts from your database into there. Maybe one directory for each customer.
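
A minimal sketch of the one-directory-per-customer layout, written against java.nio.file only. CustomerFontDirs is a hypothetical helper, and the demo uses a temp directory on the default file system so that it has no dependency beyond the JDK; with jimfs the root would instead be a Path obtained from the in-memory FileSystem.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: one font directory per customer beneath a root path.
final class CustomerFontDirs {
    private final Path root;

    CustomerFontDirs(Path root) {
        this.root = root;
    }

    // Demo convenience: a root inside the default file system's temp area.
    static CustomerFontDirs inTempDir() {
        try {
            return new CustomerFontDirs(Files.createTempDirectory("customer-fonts"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stores font bytes (e.g. read from a database) as <root>/<customerId>/<fileName>.
    Path store(String customerId, String fileName, byte[] fontBytes) {
        try {
            Path dir = Files.createDirectories(root.resolve(customerId));
            return Files.write(dir.resolve(fileName), fontBytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Lists the font file names stored for one customer, sorted for stable output.
    List<String> list(String customerId) {
        Path dir = root.resolve(customerId);
        if (!Files.isDirectory(dir)) {
            return Collections.emptyList();
        }
        try (Stream<Path> files = Files.list(dir)) {
            return files.map(p -> p.getFileName().toString()).sorted().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

final class CustomerFontDirsDemo {
    public static void main(String[] args) {
        CustomerFontDirs dirs = CustomerFontDirs.inTempDir();
        dirs.store("customer1", "Helvetica.ttf", new byte[] {1, 2, 3});
        dirs.store("customer2", "MyCorporateFont.ttf", new byte[] {4});
        System.out.println(dirs.list("customer1")); // [Helvetica.ttf]
        System.out.println(dirs.list("customer2")); // [MyCorporateFont.ttf]
    }
}
```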

>  Some of our customers are (document) service providers working for a bunch of other companies. There it is important to be able to use different configurations for different groups of documents, because of licencing and other reasons.

So you have fonts for specific customers for rendering only their documents and you expect those to change frequently? Really? Why don’t your customers embed such fonts?

Are you talking about providing “desired” mappings, e.g. allowing “Helvetica” to be mapped to the customer’s own “Helvetica.ttf” or are you talking about support for custom fonts, e.g. “MyCorporateFont.ttf”?

Are you wanting to customise the substitution behaviour, or just provide additional font files?

> Last but not least, the configuration (available fonts) can be changed by storing some new resources into the database. Restarting the application server (cluster) - so that a static class gets instantiated again and reads the new configuration - is no option.

Yes, obviously you don’t want to have to restart. So we do need some way in PDFBox to allow the FontMapper to be re-initialized at runtime. Note that PDFBox no longer requests one font at a time, it requests all fonts because we need to examine all fonts to find the best match. You’re also going to need to store all of the font metadata in your database, so that your custom FontProvider can pull all of it when it’s initialised. That means all of the fields in the FontInfo class need to be present in your database.
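
To make the "all fonts up front" point concrete, here is a simplified sketch of an index built once from every record a provider supplies. FontRecord and IndexedFontLookup are hypothetical; the real FontInfo class carries more attributes, and the rule used here (exact name first, otherwise the closest weight within the family) is an assumption for illustration, not PDFBox's actual matching logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical metadata record; a stand-in for the much richer FontInfo.
final class FontRecord {
    final String postScriptName;
    final String family;
    final int weight; // e.g. 400 = regular, 700 = bold

    FontRecord(String postScriptName, String family, int weight) {
        this.postScriptName = postScriptName;
        this.family = family;
        this.weight = weight;
    }
}

final class IndexedFontLookup {
    private final Map<String, FontRecord> byName = new HashMap<>();
    private final Map<String, List<FontRecord>> byFamily = new HashMap<>();

    // The index is built once, which is why every record must be available up front.
    IndexedFontLookup(List<FontRecord> allFonts) {
        for (FontRecord font : allFonts) {
            byName.put(font.postScriptName, font);
            byFamily.computeIfAbsent(font.family, k -> new ArrayList<>()).add(font);
        }
    }

    // Exact name match first; otherwise the closest weight within the same family.
    Optional<FontRecord> find(String postScriptName, String family, int weight) {
        FontRecord exact = byName.get(postScriptName);
        if (exact != null) {
            return Optional.of(exact);
        }
        FontRecord best = null;
        for (FontRecord font : byFamily.getOrDefault(family, Collections.emptyList())) {
            if (best == null || Math.abs(font.weight - weight) < Math.abs(best.weight - weight)) {
                best = font;
            }
        }
        return Optional.ofNullable(best);
    }
}

final class IndexedFontLookupDemo {
    public static void main(String[] args) {
        IndexedFontLookup lookup = new IndexedFontLookup(Arrays.asList(
                new FontRecord("Corp-Regular", "Corp", 400),
                new FontRecord("Corp-Bold", "Corp", 700)));
        // No exact match for Corp-Semibold, so the closest weight (700) wins.
        System.out.println(lookup.find("Corp-Semibold", "Corp", 600).get().postScriptName); // Corp-Bold
    }
}
```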

>> Initialising FileSystemFontProvider can take 10 seconds or more, so it’s
>> not practical to create a new one for each document.
> I do not care :-) I am not reading the file system (usually not allowed to anyway). And I know the place where to initialize and cache the fonts in my application.

Everyone else does care if it takes 10 seconds to open every PDDocument, but...

>> We could switch to having a per-document FontMapper or FontProvider, with the default being to use a shared static provider
> Would be perfect, I think.

… this mitigates that issue, so it should be fine.
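
One possible shape for "per-document, defaulting to a shared static" is sketched below. All names (NameMapper, SharedMappers, DocumentFontContext) are hypothetical and not PDFBox API. Code with no document at hand, such as static font constants, would keep going through the shared default.

```java
import java.util.Objects;

// Hypothetical stand-in for a font mapper; not the PDFBox FontMapper API.
interface NameMapper {
    String map(String requestedName);
}

// Shared static default, still reachable by code that has no document at hand.
final class SharedMappers {
    private static volatile NameMapper defaultMapper = name -> name;

    private SharedMappers() {
    }

    static NameMapper getDefault() {
        return defaultMapper;
    }

    static void setDefault(NameMapper mapper) {
        defaultMapper = Objects.requireNonNull(mapper);
    }
}

// Per-document holder: uses its own mapper if one was set, else the shared default.
final class DocumentFontContext {
    private NameMapper override; // null means "use the shared default"

    void setMapper(NameMapper mapper) {
        this.override = mapper;
    }

    NameMapper mapper() {
        return override != null ? override : SharedMappers.getDefault();
    }
}

final class DocumentFontContextDemo {
    public static void main(String[] args) {
        DocumentFontContext plain = new DocumentFontContext();
        DocumentFontContext custom = new DocumentFontContext();
        custom.setMapper(name -> "Customer-" + name);
        System.out.println(plain.mapper().map("Helvetica"));  // Helvetica
        System.out.println(custom.mapper().map("Helvetica")); // Customer-Helvetica
    }
}
```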

— John

> Best regards,
> Petr


Re: Font provider since PDFBOX-2842

Posted by Petr Slabý <sl...@kadel.cz>.
John,
I am facing basically the same issues as Cornelis.

Our application is running in a J2EE application server, so installing 
anything into the operating system is usually not an option. It is already 
difficult in a cluster environment managed by the customer, but becomes 
virtually impossible in a dynamic cloud. Everything has to come from a 
database. Some of our customers are (document) service providers working for 
a bunch of other companies. There it is important to be able to use 
different configurations for different groups of documents, because of 
licencing and other reasons. Last but not least, the configuration 
(available fonts) can be changed by storing some new resources into the 
database. Restarting the application server (cluster) - so that a static 
class gets instantiated again and reads the new configuration - is no 
option.

> Initialising FileSystemFontProvider can take 10 seconds or more, so it’s
> not practical to create a new one for each document.
I do not care :-) I am not reading the file system (usually not allowed to 
anyway). And I know the place where to initialize and cache the fonts in my 
application.

> We could switch to having a per-document FontMapper or FontProvider, with 
> the default being to use a shared static provider
Would be perfect, I think.

Best regards,
Petr


Re: Font provider since PDFBOX-2842

Posted by John Hewson <jo...@jahewson.com>.
Hi Cornelis,

> On 1 Sep 2015, at 05:26, Cornelis Hoeflake <c....@postex.com> wrote:
> 
> Sorry for my delayed reply, I missed your reply for some reason...
> 
> 2015-08-24 20:54 GMT+02:00 John Hewson <jo...@jahewson.com>:
> 
>> Hi Cornelis,
>> 
>>> On 24 Aug 2015, at 02:20, Cornelis Hoeflake <c....@postex.com>
>> wrote:
>>> 
>>> Hi,
>>> 
>>> Before PDFBOX-2842 we set the FontProvider on ExternalFonts to a
>>> thread-bound font provider (uses ThreadLocal). This is done because we
>>> have a system where multiple customers have their own fonts. Those fonts
>>> can also be added to the system dynamically at runtime. We have
>>> implemented the FontProvider so that it looks in the database for a
>>> font request.
>> 
>> My first question is: do you really need this? What fonts are your users
>> uploading and
>> why are they missing from PDFs? Could you make them available in some
>> other way?
>> Do the fonts really need to be locked down to a given user? Why not keep
>> it simple and
>> copy the font files to the local system?
>> 
> 
> My customers have licensed (and custom made) fonts. So the first issue is
> that a license for customer 1 is not valid for customer 2.

I guess I’m not quite sure why these custom fonts aren’t embedded in the
PDFs already? You need custom fonts for rendering PDFs? And you expect
these to change regularly?

> The next problem
> is that the names of custom-made fonts do not have to be globally unique.
> The last problem is that the server environment is provisioned on services
> like Amazon Elastic Beanstalk with autoscaling etc. In that case there is no
> decent option to copy fonts to the system.

You might want to look at something like jimfs, which is an in-memory
filesystem for Java. https://github.com/google/jimfs

>>> In the new situation FontMapper reads the font information once (at
>>> setProvider) and uses this global for the whole system.
>> 
>> The old approach, ExternalFonts used the font name to perform a direct
>> lookup,
>> delegating this to the FontProvider. The new FontMapper performs a best-fit
>> lookup using multiple attributes such as the name, ROS, weight, family,
>> unicode
>> ranges, style, and panose classification. This requires that we first
>> build an index
>> of those attributes for each font.
>> 
> 
> Ok, that is a nice feature. But for my case customers want the correct
> font, not a best-fit.

Then they should be embedding their fonts!

>>> How can we create a situation like we had but then with the new code? I
>> do
>>> not see an option.
>> 
>> Could you explain how you’re currently using ThreadLocal? There might be a
>> workaround. Failing that we could provide a mechanism to allow the font
>> index
>> to be updated dynamically.
>> 
> 
> I have set a custom-made 'general' FontProvider. Before doing any operation
> which uses ExternalFonts.getProvider(), I set a customer FontProvider
> (which knows all about the fonts of that customer) in a ThreadLocal in the
> 'general' FontProvider. The 'general' FontProvider delegates each request
> to the customer FontProvider.
> 
> 
>>> I think it is a good idea to drop static FontMapper, FontProvider etc.
>> And
>>> replace it with a given FontProvider/Mapper at start of a document.
>> 
>> Initialising FileSystemFontProvider can take 10 seconds or more, so it’s
>> not practical
>> to create a new one for each document. We could switch to having a
>> per-document
>> FontMapper or FontProvider, with the default being to use a shared static
>> provider,
>> with the user being able to set their own, however we have static APIs
>> which require
>> a FontMapper to exist independently from a document, namely:
>> 
>>        - all PDFont subclass constructors
>>        - PDType1Font constants (TIMES_ROMAN, HELVETICA, COURIER, etc.)
>> 
>> I don’t see any way to make the changes you’re after without breaking all
>> those APIs.
>> 
> 
> Yes, that is true... On the other hand, more and more server software
> products are switching to services like Amazon Elastic Beanstalk etc. And
> last but not least, the API for FontProvider/Mapper and the previous
> ExternalFonts is already broken, both within 2.0 and between 1.8 and 2.0.

I didn’t mention the API for FontProvider/Mapper... I mentioned the PDFont
constructors and PDType1Font constants. Those are widely used APIs
which depend upon having a static FontMapper.

So, as I said, there are some really important PDFBox APIs which will break
if we change this. I’d like to see some suggestions for how we’re supposed
to deal with that.

— John


Re: Font provider since PDFBOX-2842

Posted by Cornelis Hoeflake <c....@postex.com>.
Sorry for my delayed reply, I missed your reply for some reason...

2015-08-24 20:54 GMT+02:00 John Hewson <jo...@jahewson.com>:

> Hi Cornelis,
>
> > On 24 Aug 2015, at 02:20, Cornelis Hoeflake <c....@postex.com>
> wrote:
> >
> > Hi,
> >
> > Before PDFBOX-2842 we set the FontProvider on ExternalFonts to a
> > thread-bound font provider (uses ThreadLocal). This is done because we
> > have a system where multiple customers have their own fonts. Those fonts
> > can also be added to the system dynamically at runtime. We have
> > implemented the FontProvider so that it looks in the database for a
> > font request.
>
> My first question is: do you really need this? What fonts are your users
> uploading and
> why are they missing from PDFs? Could you make them available in some
> other way?
> Do the fonts really need to be locked down to a given user? Why not keep
> it simple and
> copy the font files to the local system?
>

My customers have licensed (and custom-made) fonts. So the first issue is
that a license for customer 1 is not valid for customer 2. The next problem
is that the names of custom-made fonts do not have to be globally unique.
The last problem is that the server environment is provisioned on services
like Amazon Elastic Beanstalk with autoscaling etc. In that case there is no
decent option to copy fonts to the system.


> > In the new situation FontMapper reads the font information once (at
> > setProvider) and uses this global for the whole system.
>
> The old approach, ExternalFonts used the font name to perform a direct
> lookup,
> delegating this to the FontProvider. The new FontMapper performs a best-fit
> lookup using multiple attributes such as the name, ROS, weight, family,
> unicode
> ranges, style, and panose classification. This requires that we first
> build an index
> of those attributes for each font.
>

OK, that is a nice feature. But in my case customers want the correct
font, not a best fit.


> > How can we create a situation like we had but then with the new code? I
> do
> > not see an option.
>
> Could you explain how you’re currently using ThreadLocal? There might be a
> workaround. Failing that we could provide a mechanism to allow the font
> index
> to be updated dynamically.
>

I have set a custom-made 'general' FontProvider. Before doing any operation
which uses ExternalFonts.getProvider(), I set a customer FontProvider
(which knows all about the fonts of that customer) in a ThreadLocal in the
'general' FontProvider. The 'general' FontProvider delegates each request
to the customer FontProvider.
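
Roughly, the delegation looks like the sketch below. FontSource is a simplified, hypothetical stand-in for the provider interface (one lookup per font name), not the actual PDFBox FontProvider signature. It only works while fonts are requested one at a time per operation, which is what changed with the new FontMapper.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical stand-in for a per-name font lookup; not the PDFBox FontProvider API.
interface FontSource {
    Optional<byte[]> findFont(String postScriptName);
}

// The 'general' provider: registered once, it delegates every request to the
// customer-specific source bound to the current thread.
final class ThreadBoundFontSource implements FontSource {
    private static final ThreadLocal<FontSource> CURRENT = new ThreadLocal<>();

    // Binds a customer source for the duration of one task, then restores the previous binding.
    static <T> T withCustomer(FontSource customer, Supplier<T> task) {
        FontSource previous = CURRENT.get();
        CURRENT.set(customer);
        try {
            return task.get();
        } finally {
            if (previous == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(previous);
            }
        }
    }

    @Override
    public Optional<byte[]> findFont(String postScriptName) {
        FontSource delegate = CURRENT.get();
        return delegate == null ? Optional.<byte[]>empty() : delegate.findFont(postScriptName);
    }
}

final class ThreadBoundFontSourceDemo {
    public static void main(String[] args) {
        FontSource general = new ThreadBoundFontSource();
        FontSource customerA = name ->
                "CorpSans".equals(name) ? Optional.of(new byte[] {1}) : Optional.<byte[]>empty();
        boolean found = ThreadBoundFontSource.withCustomer(customerA,
                () -> general.findFont("CorpSans").isPresent());
        System.out.println(found);                                    // true
        System.out.println(general.findFont("CorpSans").isPresent()); // false: nothing bound
    }
}
```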


> > I think it is a good idea to drop static FontMapper, FontProvider etc.
> And
> > replace it with a given FontProvider/Mapper at start of a document.
>
> Initialising FileSystemFontProvider can take 10 seconds or more, so it’s
> not practical to create a new one for each document. We could switch to
> having a per-document FontMapper or FontProvider, with the default being
> to use a shared static provider, with the user being able to set their
> own, however we have static APIs which require a FontMapper to exist
> independently from a document, namely:
>
>         - all PDFont subclass constructors
>         - PDType1Font constants (TIMES_ROMAN, HELVETICA, COURIER, etc.)
>
> I don’t see any way to make the changes you’re after without breaking all
> those APIs.
>

Yes, that is true... On the other hand, more and more server software
products are switching to services like Amazon Elastic Beanstalk etc. And
last but not least, the API for FontProvider/Mapper and the previous
ExternalFonts is already broken, both within 2.0 and between 1.8 and 2.0.


> — John
>
> > Kind regards,
> > Cornelis Hoeflake

Re: Font provider since PDFBOX-2842

Posted by John Hewson <jo...@jahewson.com>.
Hi Cornelis,

> On 24 Aug 2015, at 02:20, Cornelis Hoeflake <c....@postex.com> wrote:
> 
> Hi,
> 
> Before PDFBOX-2842 we set the FontProvider on
> ExternalFonts to a thread-bound font provider (it uses ThreadLocal).
> We do this because we have a system in which multiple customers each have
> their own fonts. Those fonts can also be added to the system dynamically at
> runtime. We have implemented the FontProvider so that it looks up each
> font request in the database.

My first question is: do you really need this? What fonts are your users uploading and
why are they missing from PDFs? Could you make them available in some other way?
Do the fonts really need to be locked down to a given user? Why not keep it simple and
copy the font files to the local system?

> In the new situation FontMapper reads the font information once (at
> setProvider) and uses this global for the whole system.

In the old approach, ExternalFonts used the font name to perform a direct lookup,
delegating this to the FontProvider. The new FontMapper performs a best-fit
lookup using multiple attributes such as the name, ROS, weight, family, unicode
ranges, style, and panose classification. This requires that we first build an index
of those attributes for each font.
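
To make the difference concrete, here is a rough sketch of a direct lookup
next to a best-fit lookup over a prebuilt index. FontEntry, FontIndex and the
scoring weights are invented for illustration; this is not the actual
FontMapper algorithm, and it covers only name, family, weight and style.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical index entry holding a few of the attributes mentioned above.
final class FontEntry {
    final String postScriptName;
    final String family;
    final int weight;      // e.g. 400 = regular, 700 = bold
    final boolean italic;

    FontEntry(String postScriptName, String family, int weight, boolean italic) {
        this.postScriptName = postScriptName;
        this.family = family;
        this.weight = weight;
        this.italic = italic;
    }
}

// Hypothetical index that must be filled before any lookup can happen.
final class FontIndex {
    private final List<FontEntry> entries = new ArrayList<>();

    void add(FontEntry entry) {
        entries.add(entry);
    }

    // Direct lookup: the exact name or nothing.
    FontEntry findExact(String postScriptName) {
        for (FontEntry e : entries) {
            if (e.postScriptName.equals(postScriptName)) {
                return e;
            }
        }
        return null;
    }

    // Best-fit lookup: an exact name wins, otherwise the highest score wins.
    // The weights (1000, 100, distance / 10) are arbitrary example values.
    FontEntry findBestFit(String postScriptName, String family, int weight, boolean italic) {
        FontEntry exact = findExact(postScriptName);
        if (exact != null) {
            return exact;
        }
        FontEntry best = null;
        int bestScore = Integer.MIN_VALUE;
        for (FontEntry e : entries) {
            int score = 0;
            if (e.family.equalsIgnoreCase(family)) {
                score += 1000;
            }
            if (e.italic == italic) {
                score += 100;
            }
            score -= Math.abs(e.weight - weight) / 10;
            if (score > bestScore) {
                bestScore = score;
                best = e;
            }
        }
        return best;
    }
}
```

The best-fit path needs every entry's attributes up front, which is why the
index has to be built before the first lookup.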

> How can we create a situation like we had but then with the new code? I do
> not see an option.

Could you explain how you’re currently using ThreadLocal? There might be a
workaround. Failing that we could provide a mechanism to allow the font index
to be updated dynamically.

> I think it is a good idea to drop static FontMapper, FontProvider etc. And
> replace it with a given FontProvider/Mapper at start of a document.

Initialising FileSystemFontProvider can take 10 seconds or more, so it’s not practical
to create a new one for each document. We could switch to having a per-document
FontMapper or FontProvider, with the default being to use a shared static provider,
with the user being able to set their own, however we have static APIs which require
a FontMapper to exist independently from a document, namely:

	- all PDFont subclass constructors
	- PDType1Font constants (TIMES_ROMAN, HELVETICA, COURIER, etc.)

I don’t see any way to make the changes you’re after without breaking all those APIs.
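
For reference, the per-document idea with a shared default could look roughly
like the sketch below. All names (FontLookup, SharedDefault, Doc) are
hypothetical and none of this exists in PDFBox. It only shows the fallback
logic; it does not solve the problem of the static constants and
constructors listed above, which have no document to ask.

```java
// Hypothetical provider abstraction, not a PDFBox type.
interface FontLookup {
    String describe();
}

// Shared default, built lazily and only once because building it is slow.
final class SharedDefault {
    static int builds = 0;            // counts how often the slow scan ran
    private static FontLookup shared;

    static synchronized FontLookup get() {
        if (shared == null) {
            shared = expensiveScan();
        }
        return shared;
    }

    // Stand-in for the slow file system scan.
    private static FontLookup expensiveScan() {
        builds++;
        return () -> "system fonts";
    }
}

// Hypothetical document that may carry its own provider.
final class Doc {
    private final FontLookup override;

    Doc() {
        this(null);
    }

    Doc(FontLookup override) {
        this.override = override;
    }

    // Uses the document's own provider if set, else the shared default.
    FontLookup fonts() {
        return override != null ? override : SharedDefault.get();
    }
}
```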

— John

> Kind regards,
> Cornelis Hoeflake


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org