Posted to users@pdfbox.apache.org by Christian Schmitt <c....@briefdomain.de> on 2015/12/08 17:59:13 UTC

Embed font's without writing Text (providing unembeded fonts)

Hello, I wanted to ask whether there is a way with PDFBox to embed fonts without providing text.

Most of the current examples are really simple; however, they require me to add text to the PDF.
What I want instead is to open an existing PDF and check whether all of its fonts are embedded (that's really easy, too).
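
A minimal sketch of such an embedded-font check, assuming a font counts as embedded when its font descriptor carries one of the font file keys from the PDF specification (`FontFile`, `FontFile2`, `FontFile3`). It models descriptors as plain Java maps instead of using the PDFBox classes; the class name `EmbeddedCheck`, its methods and the font names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of an "is this font embedded?" check; not the PDFBox API. */
class EmbeddedCheck {

    private static final String[] FONT_FILE_KEYS = {"FontFile", "FontFile2", "FontFile3"};

    /** A descriptor counts as embedded if any of the font file keys has a value. */
    static boolean isEmbedded(Map<String, Object> descriptor) {
        if (descriptor == null) {
            return false; // e.g. a font dictionary that has no descriptor at all
        }
        for (String key : FONT_FILE_KEYS) {
            if (descriptor.get(key) != null) {
                return true;
            }
        }
        return false;
    }

    /** Returns the names of all fonts whose descriptor has no font program. */
    static List<String> notEmbedded(Map<String, Map<String, Object>> descriptorsByFont) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, Map<String, Object>> e : descriptorsByFont.entrySet()) {
            if (!isEmbedded(e.getValue())) {
                missing.add(e.getKey());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> fonts = new LinkedHashMap<>();
        Map<String, Object> withProgram = new LinkedHashMap<>();
        withProgram.put("FontFile2", new byte[] {0, 1, 0, 0}); // placeholder bytes
        fonts.put("ExampleSans", withProgram);                  // hypothetical font names
        fonts.put("ExampleSerif", new LinkedHashMap<>());
        System.out.println(notEmbedded(fonts)); // prints [ExampleSerif]
    }
}
```

The resulting list of names is what a later embedding step would work through.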

And now I want to embed them if they are not (I mostly want to do this for a PDF/A conversion).

The examples I have looked at so far: https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/EmbeddedFonts.java?view=markup
And I also looked at: https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/EmbeddedFiles.java?view=markup

With the EmbeddedFiles example I could easily embed files. However, does this apply to fonts, too? Where are the fonts stored inside a PDF?
I also had an example that lists all the fonts inside a PDF, but I can't find it anymore.

My current solution creates a PDF/A from a standard PDF, but the result does not conform to the standard.

RE: Embed font's without writing Text (providing unembeded fonts)

Posted by Donnie <da...@hotmail.com>.
 
Subject: Re: Embed font's without writing Text (providing unembeded fonts)
To: users@pdfbox.apache.org
From: THausherr@t-online.de
Date: Wed, 9 Dec 2015 18:31:00 +0100

On 09.12.2015 at 15:19, Christian Schmitt wrote:
> I know how to extract them, however adding them seems to be the
> „bigger“ problem, since mostly I can’t just embed them with a
> command. The problem relies on where I should put the font file,
> currently I have a folder with all the necessary fonts and could
> create font objects with that and also I know which fonts are
> missing inside my pdf, but now the problem relies on attach the
> file i.e. embed the font.

You need to understand the COS data types (COSDictionary, COSStream,
COSArray, COSBase etc.), then insert a COSStream (that has your font
file) at the correct place in the font descriptor (type
COSDictionary) with the correct key. The key is FontFile, FontFile2
or FontFile3 depending on the type of font.

Tilman

> Beste Grüße / Best Regards
>
> Christian Schmitt
> Entwickler / Developer
>
>> On 08.12.2015 at 18:49, Tilman Hausherr <TH...@t-online.de> wrote:
>>
>> PDFDebugger


Re: Embed font's without writing Text (providing unembeded fonts)

Posted by Christian Schmitt <c....@envisia.de>.
These products aren't as customizable as I would like, and I also want to learn more about PDFs. Also, if I end up with some working code, I could share it.
Beste Grüße / Best Regards

Christian Schmitt
Entwickler / Developer



> On 09.12.2015 at 22:28, Tilman Hausherr <TH...@t-online.de> wrote:
> 
> On 09.12.2015 at 22:01, Christian Schmitt wrote:
>> Hello,
>> 
>> thanks for your help I already looked a little bit deeper, after I tried to use PDResources which could add font’s but if I add the font in every page the file gets too big and if I only add the font at the first page it won’t validate correctly.
> 
> If it is the same font, you can add the same COSStream object, i.e. you don't have to create a new one.
> 
> About the size - that's why subsets are used. But for that, you would have to analyse each content stream to see which glyphs are used.
> 
> It's probably cheaper to buy a product like callas pdfaPilot.
> https://secure.callassoftware.com/buy_per_web/cls_formBPW_de.php?product=610
> 
> Tilman


Re: Embed font's without writing Text (providing unembeded fonts)

Posted by Tilman Hausherr <TH...@t-online.de>.
On 09.12.2015 at 22:01, Christian Schmitt wrote:
> Hello,
>
> thanks for your help I already looked a little bit deeper, after I 
> tried to use PDResources which could add font’s but if I add the font 
> in every page the file gets too big and if I only add the font at the 
> first page it won’t validate correctly.

If it is the same font, you can add the same COSStream object, i.e. you 
don't have to create a new one.
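
A minimal sketch of that reuse, assuming the embedding code keeps one stream object per font and hands the same instance to every font descriptor that needs it. It uses plain Java objects as stand-ins rather than the PDFBox COS classes; `SharedFontStream`, `FontStream`, the `FontFile2` key choice and the font name are hypothetical illustration details.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model: several font descriptors referencing one shared font stream. */
class SharedFontStream {

    /** Stand-in for a stream object that holds the font program (hypothetical type). */
    static final class FontStream {
        final byte[] data;

        FontStream(byte[] data) {
            this.data = data;
        }
    }

    private final Map<String, FontStream> streamsByFontName = new HashMap<>();

    /** Attaches the font to a descriptor, creating the stream only on first use. */
    void attach(Map<String, Object> descriptor, String fontName, byte[] fontProgram) {
        FontStream stream =
                streamsByFontName.computeIfAbsent(fontName, n -> new FontStream(fontProgram));
        descriptor.put("FontFile2", stream);
    }

    /** Number of stream objects actually created so far. */
    int distinctStreams() {
        return streamsByFontName.size();
    }

    public static void main(String[] args) {
        SharedFontStream embedder = new SharedFontStream();
        byte[] program = new byte[1024]; // placeholder for real font bytes
        List<Map<String, Object>> descriptors = new ArrayList<>();
        for (int page = 0; page < 3; page++) {
            Map<String, Object> descriptor = new HashMap<>();
            embedder.attach(descriptor, "ExampleSans", program);
            descriptors.add(descriptor);
        }
        boolean shared =
                descriptors.get(0).get("FontFile2") == descriptors.get(2).get("FontFile2");
        // prints: streams created: 1, shared: true
        System.out.println("streams created: " + embedder.distinctStreams() + ", shared: " + shared);
    }
}
```

Three descriptors end up pointing at one stream, so the font data would be stored only once however many pages use it.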

About the size - that's why subsets are used. But for that, you would 
have to analyse each content stream to see which glyphs are used.
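
A heavily simplified sketch of what such an analysis involves, assuming a content stream that only selects fonts with `Tf` and shows unescaped literal strings with `Tj`. Real content streams also use `TJ` arrays, hex strings, escape sequences and other text operators, and the collected character codes still have to be mapped to glyphs through each font's encoding. The class `GlyphUsage` and its method are hypothetical and do not use PDFBox.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy scanner that records which characters are shown with which font resource name. */
class GlyphUsage {

    // Either a font selection "/Name size Tf" (group 1)
    // or an unescaped literal string shown with Tj (group 2).
    private static final Pattern OPS = Pattern.compile(
            "/(\\w+)\\s+[\\d.]+\\s+Tf|\\(([^()\\\\]*)\\)\\s*Tj");

    static Map<String, Set<Character>> usedCharacters(String contentStream) {
        Map<String, Set<Character>> used = new LinkedHashMap<>();
        String currentFont = null;
        Matcher m = OPS.matcher(contentStream);
        while (m.find()) {
            if (m.group(1) != null) {
                currentFont = m.group(1);           // font switched by Tf
            } else if (currentFont != null) {
                Set<Character> chars = used.computeIfAbsent(currentFont, f -> new TreeSet<>());
                for (char c : m.group(2).toCharArray()) {
                    chars.add(c);                   // character shown with the current font
                }
            }
        }
        return used;
    }

    public static void main(String[] args) {
        String stream = "BT /F1 12 Tf (Hello) Tj /F2 10 Tf (abc) Tj ET";
        System.out.println(usedCharacters(stream)); // prints {F1=[H, e, l, o], F2=[a, b, c]}
    }
}
```

Only the characters collected per font would need to go into that font's subset.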

It's probably cheaper to buy a product like callas pdfaPilot.
https://secure.callassoftware.com/buy_per_web/cls_formBPW_de.php?product=610

Tilman

> After that I looked at COS but I’m not that far yet, still thanks 
> after looking inside the source of pdfbox I will definitely find out 
> more things.


Re: Embed font's without writing Text (providing unembeded fonts)

Posted by Christian Schmitt <c....@envisia.de>.
Hello,

Thanks for your help. I have already looked a little deeper. I tried to use PDResources, which can add fonts, but if I add the font on every page the file gets too big, and if I only add the font on the first page the file won't validate correctly.
After that I looked at COS, but I'm not that far yet. Still, thanks; after looking inside the PDFBox source I will definitely find out more.


> On 09.12.2015 at 18:31, Tilman Hausherr <TH...@t-online.de> wrote:
> 
> On 09.12.2015 at 15:19, Christian Schmitt wrote:
>> I know how to extract them, however adding them seems to be the „bigger“ problem, since mostly I can’t just embed them with a command. The problem relies on where I should put the font file, currently I have a folder with all the necessary fonts and could create font objects with that and also I know which fonts are missing inside my pdf, but now the problem relies on attach the file i.e. embed the font.
> 
> You need to understand the COS data types (COSDictionary, COSStream, COSArray, COSBase etc), then insert a COSStream (that has your font file) at the correct place in the font descriptor (type COSDictionary) with the correct key. The key is FontFile1, FontFile2 or FontFile3 depending on the type of font.
> 
> Tilman


Re: Embed font's without writing Text (providing unembeded fonts)

Posted by Tilman Hausherr <TH...@t-online.de>.
On 09.12.2015 at 15:19, Christian Schmitt wrote:
> I know how to extract them, however adding them seems to be the 
> „bigger“ problem, since mostly I can’t just embed them with a command. 
> The problem relies on where I should put the font file, currently I 
> have a folder with all the necessary fonts and could create font 
> objects with that and also I know which fonts are missing inside my 
> pdf, but now the problem relies on attach the file i.e. embed the font.

You need to understand the COS data types (COSDictionary, COSStream, 
COSArray, COSBase etc.), then insert a COSStream (that has your font 
file) at the correct place in the font descriptor (type COSDictionary) 
with the correct key. The key is FontFile, FontFile2 or FontFile3 
depending on the type of font.
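
A minimal sketch of that key choice, using a plain Java map as a stand-in for the font descriptor rather than the PDFBox COS classes. The names `FontFileKeys`, `keyFor` and `embed`, and the font name, are hypothetical. It assumes the key names from the PDF specification, where the Type 1 key is written `FontFile` without a digit and a `FontFile3` stream additionally carries a `Subtype` entry naming the font format.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model of choosing the font file key in a font descriptor; not the PDFBox API. */
class FontFileKeys {

    /** Returns the descriptor key under which a font program of the given format is stored. */
    static String keyFor(String fontFormat) {
        switch (fontFormat) {
            case "Type1":
                return "FontFile";   // Type 1 font program
            case "TrueType":
                return "FontFile2";  // TrueType font program
            case "Type1C":
            case "CIDFontType0C":
            case "OpenType":
                return "FontFile3";  // format is named by the stream's Subtype entry
            default:
                throw new IllegalArgumentException("Unknown font format: " + fontFormat);
        }
    }

    /** Stores the font bytes in the descriptor under the matching key and returns that key. */
    static String embed(Map<String, Object> descriptor, String fontFormat, byte[] fontProgram) {
        String key = keyFor(fontFormat);
        descriptor.put(key, fontProgram);
        return key;
    }

    public static void main(String[] args) {
        Map<String, Object> descriptor = new HashMap<>();
        descriptor.put("FontName", "ExampleSans"); // hypothetical font name
        String key = embed(descriptor, "TrueType", new byte[] {0, 1, 0, 0});
        System.out.println("stored under /" + key); // prints: stored under /FontFile2
    }
}
```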

Tilman



Re: Embed font's without writing Text (providing unembeded fonts)

Posted by Christian Schmitt <c....@envisia.de>.
I know how to extract them; however, adding them seems to be the „bigger“ problem, since I can't just embed them with a single command. The problem lies in where I should put the font file. Currently I have a folder with all the necessary fonts and could create font objects from it, and I also know which fonts are missing from my PDF. What remains is attaching the file, i.e. embedding the font.


Beste Grüße / Best Regards

Christian Schmitt
Entwickler / Developer

> On 08.12.2015 at 18:49, Tilman Hausherr <TH...@t-online.de> wrote:
> 
> PDFDebugger


Re: Embed font's without writing Text (providing unembeded fonts)

Posted by Tilman Hausherr <TH...@t-online.de>.
On 08.12.2015 at 18:39, Christian Schmitt wrote:
> I know that however first I search for a solution on the font problem. ;)

You'd need to read about the different font types, and then integrate 
the actual font into the PDF file. Of course you must have the fonts in 
the proper type. Or replace the font descriptor. That is probably 
several weeks of work.

Btw about your question - the fonts are in the resource dictionaries of 
pages, XObject forms, patterns, sometimes also in annotations and 
fields. Download PDFDebugger and then look at the pages of your PDF.
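
A minimal sketch of walking such nested resource dictionaries, with plain Java maps standing in for the COS dictionaries. It follows only the `Font`, `XObject` and `Pattern` entries and the `Resources` of nested objects; annotations and form fields are not covered. The names `FontWalker`, `collectFonts` and `dict` are hypothetical, and nothing here is PDFBox API.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Plain-Java model of collecting font names from nested resource dictionaries. */
class FontWalker {

    /** Collects BaseFont names from a resource dictionary and everything nested in it. */
    static Set<String> collectFonts(Map<String, Object> resources) {
        Set<String> names = new LinkedHashSet<>();
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        walk(resources, names, visited);
        return names;
    }

    private static void walk(Map<String, Object> resources, Set<String> names, Set<Object> visited) {
        if (resources == null || !visited.add(resources)) {
            return; // nothing to do, or already seen (guards against cycles)
        }
        for (Object font : children(resources, "Font")) {
            if (font instanceof Map) {
                Object baseFont = asDict(font).get("BaseFont");
                if (baseFont != null) {
                    names.add(baseFont.toString());
                }
            }
        }
        for (String category : new String[] {"XObject", "Pattern"}) {
            for (Object child : children(resources, category)) {
                if (child instanceof Map && asDict(child).get("Resources") instanceof Map) {
                    walk(asDict(asDict(child).get("Resources")), names, visited);
                }
            }
        }
    }

    private static Iterable<Object> children(Map<String, Object> resources, String category) {
        Object sub = resources.get(category);
        if (sub instanceof Map) {
            return asDict(sub).values();
        }
        return Collections.<Object>emptyList();
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asDict(Object o) {
        return (Map<String, Object>) o;
    }

    /** Small helper to build a dictionary from alternating keys and values. */
    static Map<String, Object> dict(Object... keyValues) {
        Map<String, Object> d = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            d.put((String) keyValues[i], keyValues[i + 1]);
        }
        return d;
    }

    public static void main(String[] args) {
        Map<String, Object> formResources = dict("Font", dict("F2", dict("BaseFont", "Arial")));
        Map<String, Object> pageResources = dict(
                "Font", dict("F1", dict("BaseFont", "Helvetica")),
                "XObject", dict("Fm1", dict("Subtype", "Form", "Resources", formResources)));
        System.out.println(collectFonts(pageResources)); // prints [Helvetica, Arial]
    }
}
```

The identity-based visited set matters because resource dictionaries can be shared between pages or refer back to each other.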

Tilman


Re: Embed font's without writing Text (providing unembeded fonts)

Posted by Christian Schmitt <c....@envisia.de>.
I know that; however, first I am looking for a solution to the font problem. ;)



> On 08.12.2015 at 18:30, Tilman Hausherr <TH...@t-online.de> wrote:
> 
> There are many possible problems with ordinary PDF files, it isn't just about fonts - just check your file with PDFBox preflight. So even if you manage to fix the non embedded fonts, you'll still have many other problems.
> 
> Tilman


Re: Embed font's without writing Text (providing unembeded fonts)

Posted by Tilman Hausherr <TH...@t-online.de>.
On 08.12.2015 at 17:59, Christian Schmitt wrote:
> Hello, i wanted to ask if there is a way with PDFBox to embed fonts 
> without providing text.
>
> Currently the most example’s are really simple, however they require 
> me to add a text to the pdf.
> But currently I want to open a existing PDF and Check all the font’s 
> if they are embedded. (that’s really easy, too).
>
> And now I want to embed them if they are not (I mostly want to do this 
> for a PDF/A conversion)
>
> Currently the example’s I looked at: 
> https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/EmbeddedFonts.java?view=markup
> And I also looked at: 
> https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/EmbeddedFiles.java?view=markup
>
> I mean with the EmbeddedFiles example I could easily embed files. 
> However does this apply to font,s too? Where are the fonts stored 
> inside a PDF.
> I also had an example to look at all the fonts inside the PDFs but I 
> don’t find it.
>
> My currently solution creates a PDF/A from a standard PDF, but it 
> won’t conform the standard.

There are many possible problems with ordinary PDF files, it isn't just 
about fonts - just check your file with PDFBox preflight. So even if you 
manage to fix the non embedded fonts, you'll still have many other 
problems.

Tilman

Re: Embed font's without writing Text (providing unembeded fonts)

Posted by John Hewson <jo...@jahewson.com>.
Hi,

One trick for embedding missing fonts would be to create a new PDFont using a method such as PDTrueTypeFont.load and then copy over any COS entries which are missing in the embedded font. You’ll need to repeat this for each font subtype.
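
A minimal sketch of the "copy over missing entries" step, reading it as: take the dictionary of the freshly loaded font (which has the font program) and copy into the existing font's dictionary only those entries the existing one lacks. Plain Java maps stand in for the COS dictionaries; `CopyMissingEntries`, `copyMissing` and the sample values are hypothetical, and the loading step itself is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java model of filling gaps in one dictionary from another; not the PDFBox API. */
class CopyMissingEntries {

    /** Copies entries from source into target only where target lacks the key. */
    static int copyMissing(Map<String, Object> source, Map<String, Object> target) {
        int copied = 0;
        for (Map.Entry<String, Object> e : source.entrySet()) {
            if (!target.containsKey(e.getKey())) {
                target.put(e.getKey(), e.getValue());
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        // Descriptor of the font already in the PDF: no font program present.
        Map<String, Object> existing = new LinkedHashMap<>();
        existing.put("FontName", "ExampleSans"); // hypothetical values
        existing.put("Flags", 32);

        // Descriptor of the freshly loaded font: includes the font program.
        Map<String, Object> fresh = new LinkedHashMap<>();
        fresh.put("FontName", "ABCDEF+ExampleSans");
        fresh.put("Flags", 32);
        fresh.put("FontFile2", new byte[] {0, 1, 0, 0}); // placeholder bytes

        int copied = copyMissing(fresh, existing);
        // prints: copied 1 entries, keys now [FontName, Flags, FontFile2]
        System.out.println("copied " + copied + " entries, keys now " + existing.keySet());
    }
}
```

Entries the existing font already has, such as its name, are left alone, so the content streams that refer to it keep working.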

But be warned that being PDF/A compliant involves a lot more than just embedding fonts. In general there’s not much chance that you’re going to be able to write the code to handle this; it would be a major contribution to PDFBox if you did!

— John

> On 8 Dec 2015, at 08:59, Christian Schmitt <c....@briefdomain.de> wrote:
> 
> Hello, i wanted to ask if there is a way with PDFBox to embed fonts without providing text.
> 
> Currently the most example’s are really simple, however they require me to add a text to the pdf.
> But currently I want to open a existing PDF and Check all the font’s if they are embedded. (that’s really easy, too).
> 
> And now I want to embed them if they are not (I mostly want to do this for a PDF/A conversion)
> 
> Currently the example’s I looked at: https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/EmbeddedFonts.java?view=markup
> And I also looked at: https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/EmbeddedFiles.java?view=markup
> 
> I mean with the EmbeddedFiles example I could easily embed files. However does this apply to font,s too? Where are the fonts stored inside a PDF.
> I also had an example to look at all the fonts inside the PDFs but I don’t find it.
> 
> My currently solution creates a PDF/A from a standard PDF, but it won’t conform the standard.