Posted to users@pdfbox.apache.org by Jorg Janke <jo...@accorto.com> on 2016/05/01 20:30:34 UTC

Guidance on fonts in 2.0.1 on Linux

Hi guys,

We migrated to 2.0.1 from 1.8 and have some issues with fonts.  We try not
to embed them and try to use the default fonts instead.
Could you please check
http://stackoverflow.com/questions/36970339/pdfbox-2-0-1-how-to-use-standard-fonts-on-amazon-linux

Cheers,
Jorg


Jorg Janke - www.accorto.com - (650) 227-3271

Re: Guidance on fonts in 2.0.1 on Linux

Posted by John Hewson <jo...@jahewson.com>.
> On 3 May 2016, at 12:03, Tres Finocchiaro <tr...@gmail.com> wrote:
> 
> Does 2.0 still support FontMapping.properties?

No, that has been removed. PDFBox 2.0 is much smarter about mapping fonts, and we
discourage custom font mapping because you’ll get different results from
what people using equivalent systems will see.

Note that the error “… is not available in this font's encoding” means that the PDF
is broken; no amount of font mapping will help you there.

— John



Re: Guidance on fonts in 2.0.1 on Linux

Posted by Tres Finocchiaro <tr...@gmail.com>.
Does 2.0 still support FontMapping.properties?

Besides the unicode limitations you've spoken of, I had great success doing
this with 1.8 on Ubuntu to get better support for the free alternative
fonts (ended up recompiling myself just to get the file embedded into the
JAR).

I've linked my sample property file in the comments of the aforementioned
StackOverflow article for those interested.



- Tres.Finocchiaro@gmail.com

On Tue, May 3, 2016 at 2:54 PM, John Hewson <jo...@jahewson.com> wrote:

> Hi Jorg,
>
> Firstly, I’d recommend embedding TTF fonts in 2.0, using PDType0Font. It
> works really well and supports Unicode.
>
> > We migrated to PDFBox 2.0.1 from 1.8 and have some issues with fonts. We
> > try not to embed them and trying to use default fonts if possible.
> >
> > That worked well in 1.8 but in 2.0.1 we get some errors when running on
> > Amazon Linux - e.g.
> >
> > PDType1Font.: Using fallback font LiberationSans for base font Times-Roman
> > U+00B7 ('middot') is not available in this font's encoding: WinAnsiEncoding
>
> Secondly, to answer your SO question, 2.0 is stricter about which
> characters you can use with the built-in fonts, because the PDF spec
> defines them as Type1 fonts, which don’t support Unicode. This is why
> you’re seeing an error about WinAnsiEncoding: that’s the encoding
> of Times-Roman in the PDF spec, and U+00B7 does not exist in that encoding.
> So even though the character is available in the font, you can’t use it.
> This was permitted in 1.8 but was technically creating “bad” PDF files.
>
> The solution is to either create your own DictionaryEncoding and make the
> PDFont use that (experts only, and may not produce consistent PDFs), or to
> use PDType0Font, which supports Unicode.
>
> — John

Re: Guidance on fonts in 2.0.1 on Linux

Posted by John Hewson <jo...@jahewson.com>.
Hi Jorg,

Firstly, I’d recommend embedding TTF fonts in 2.0, using PDType0Font. It works really well and supports Unicode.
> We migrated to PDFBox 2.0.1 from 1.8 and have some issues with fonts. We try not to embed them and trying to use default fonts if possible.
> 
> That worked well in 1.8 but in 2.0.1 we get some errors when running on Amazon Linux - e.g.
> 
> PDType1Font.: Using fallback font LiberationSans for base font Times-Roman
> U+00B7 ('middot') is not available in this font's encoding: WinAnsiEncoding
Secondly, to answer your SO question, 2.0 is stricter about which characters you can use with the built-in fonts, because the PDF spec defines them as Type1 fonts, which don’t support Unicode. This is why you’re seeing an error about WinAnsiEncoding: that’s the encoding of Times-Roman in the PDF spec, and U+00B7 does not exist in that encoding. So even though the character is available in the font, you can’t use it. This was permitted in 1.8 but was technically creating “bad” PDF files.
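
To illustrate why a single-byte encoding cannot cover arbitrary Unicode text, here is a small library-free sketch. It uses the JDK's windows-1252 charset purely as a stand-in for WinAnsiEncoding; the two are close but not identical, and PDFBox performs its own check against the font's encoding, so results for individual characters can differ from what PDFBox reports. The class and method names (SingleByteCheck, unencodable) are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

public class SingleByteCheck {

    /**
     * Returns the code points in text (formatted as U+XXXX) that the given
     * charset cannot encode. Assumption: the charset only approximates the
     * PDF encoding you care about.
     */
    public static List<String> unencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        List<String> missing = new ArrayList<>();
        text.codePoints().forEach(cp -> {
            String s = new String(Character.toChars(cp));
            if (!encoder.canEncode(s)) {
                missing.add(String.format("U+%04X", cp));
            }
        });
        return missing;
    }

    public static void main(String[] args) {
        // windows-1252 stands in for WinAnsiEncoding here (an approximation).
        Charset cp1252 = Charset.forName("windows-1252");
        // The arrow and the Greek capital omega have no single-byte code.
        System.out.println(unencodable("Total \u2192 42 \u03A9", cp1252));
        // prints [U+2192, U+03A9]
    }
}
```

A check like this can flag text that will not fit a single-byte encoding before it reaches the PDF layer, but it is not a substitute for PDFBox's own validation.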

The solution is to either create your own DictionaryEncoding and make the PDFont use that (experts only, and may not produce consistent PDFs), or to use PDType0Font, which supports Unicode.
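
A minimal sketch of the PDType0Font route with PDFBox 2.0.x, for anyone who wants a starting point. The class name EmbeddedFontExample, the command-line arguments, the sample text, and the page coordinates are all assumptions made for illustration; the TrueType file is whatever font you are licensed to embed, and it must actually contain glyphs for the characters you write.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType0Font;

public class EmbeddedFontExample {
    public static void main(String[] args) throws IOException {
        // Assumption: args[0] is a TrueType font file, args[1] is the output PDF.
        File ttf = new File(args[0]);
        File out = new File(args[1]);

        try (PDDocument doc = new PDDocument()) {
            PDPage page = new PDPage();
            doc.addPage(page);

            // Load the TTF as a composite (Type 0) font embedded in this document.
            PDFont font = PDType0Font.load(doc, ttf);

            try (PDPageContentStream cs = new PDPageContentStream(doc, page)) {
                cs.beginText();
                cs.setFont(font, 12);
                cs.newLineAtOffset(72, 700);
                // U+00B7 is the character from the error message in this thread;
                // showText still fails if the chosen font has no glyph for it.
                cs.showText("Price \u00B7 Quantity");
                cs.endText();
            }

            doc.save(out);
        }
    }
}
```

Because the font is embedded, the output no longer depends on which fonts happen to be installed on the machine that opens the file, at the cost of a larger PDF.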

— John
