Posted to users@pdfbox.apache.org by db...@bergqvist.se on 2015/12/28 04:40:48 UTC

How do I create a valid PDF/A document with PDFBOX 2.0?

Hi,

I have read this document about creating a valid PDF/A document:

https://pdfbox.apache.org/1.8/cookbook/pdfacreation.html

The problem is that it is written for PDFBox 1.8 and I cannot get it to 
work for PDFBox 2.0. I have searched the examples folder but haven't been 
able to find an example or any other documentation on how to create a 
valid PDF/A document.

Or is it simply that PDFBox 2.0 is not finished yet and that I should 
use PDFBox 1.8 instead?

Kind regards,
Daniel

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: How do I create a valid PDF/A document with PDFBOX 2.0?

Posted by Tilman Hausherr <TH...@t-online.de>.
On 28.12.2015 at 23:56, Nakul Malhotra wrote:
> what is the best way to go about learning how to use pdfBox?
>

The best way is to look at the examples and find the one most similar to 
what you want to do. There's no "PDFBox in action" book or tutorial.

If you want to do advanced things, you'll need to read the PDF 
specification.
https://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf

Re PDF/A: don't forget that after creating a PDF, you should check it 
with PDFBox Preflight. The blue bar in Adobe Reader only means that the 
file claims to be PDF/A, not that it actually is one.
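
To illustrate why the blue bar is only a claim: the claim is nothing more than two properties (pdfaid:part and pdfaid:conformance) in the document's XMP metadata. The sketch below reads them from an XMP packet using only the JDK's XML parser. The class name PdfAClaimReader and its method are hypothetical, and how you obtain the XMP packet from the PDF is left out. Nothing here validates the PDF; that is what Preflight is for.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper: reports what an XMP packet CLAIMS, nothing more.
final class PdfAClaimReader
{
    // namespace of the PDF/A identification schema
    static final String PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/";
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    private PdfAClaimReader()
    {
    }

    // Returns the claimed level such as "1B", or null if there is no PDF/A claim.
    static String claimedLevel(String xmpPacket)
    {
        try
        {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document dom = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(xmpPacket.getBytes(StandardCharsets.UTF_8)));

            String part = property(dom, "part");
            String conformance = property(dom, "conformance");
            if (part == null)
            {
                return null;
            }
            return conformance == null ? part : part + conformance;
        }
        catch (ParserConfigurationException | SAXException | IOException e)
        {
            throw new IllegalArgumentException("not a parseable XMP packet", e);
        }
    }

    // XMP allows a simple property to be written either as a child element
    // or as an attribute of rdf:Description, so both forms are checked.
    private static String property(Document dom, String name)
    {
        NodeList elements = dom.getElementsByTagNameNS(PDFAID_NS, name);
        if (elements.getLength() > 0)
        {
            return elements.item(0).getTextContent().trim();
        }
        NodeList descriptions = dom.getElementsByTagNameNS(RDF_NS, "Description");
        for (int i = 0; i < descriptions.getLength(); i++)
        {
            Element description = (Element) descriptions.item(i);
            if (description.hasAttributeNS(PDFAID_NS, name))
            {
                return description.getAttributeNS(PDFAID_NS, name);
            }
        }
        return null;
    }
}
```

Any file can carry these two properties, which is why a non-null result here says nothing about actual conformance.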

Tilman



Re: How do I create a valid PDF/A document with PDFBOX 2.0?

Posted by Nakul Malhotra <ni...@gmail.com>.
What is the best way to go about learning how to use PDFBox?

Re: How do I create a valid PDF/A document with PDFBOX 2.0?

Posted by db...@bergqvist.se.
This code loads the PDF "test.pdf" and saves every page as a PNG file.

If you want to show the page on screen you could use the method 
pdfRenderer.renderPageToGraphics() instead.



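import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.pdfbox.tools.imageio.ImageIOUtil;

public void extractPages() throws IOException
{
     String pdfFilename = "test";

     PDDocument document = PDDocument.load(new File(pdfFilename + ".pdf"));
     try
     {
         PDFRenderer pdfRenderer = new PDFRenderer(document);
         for (int pageNo = 0; pageNo < document.getNumberOfPages(); pageNo++)
         {
             // render the page (0-based index) at 300 DPI
             BufferedImage bim = pdfRenderer.renderImageWithDPI(pageNo, 300, ImageType.RGB);

             // suffix in filename will be used as the file format;
             // pageNo + 1 gives 1-based file names without skipping pages
             ImageIOUtil.writeImage(bim, pdfFilename + "-" + (pageNo + 1) + ".png", 300);
         }
     }
     finally
     {
         // close the document even if rendering fails
         document.close();
     }
}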
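
The comment about the filename suffix deserves a short illustration. The sketch below shows the general idea with plain javax.imageio: take the text after the last dot as the format name and hand it to ImageIO. The class SuffixFormat and both of its methods are hypothetical names made up for this example; this is not how ImageIOUtil is implemented, and it ignores the DPI metadata that ImageIOUtil writes.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Locale;
import javax.imageio.ImageIO;

// Hypothetical helper, only to illustrate "suffix decides the format".
final class SuffixFormat
{
    private SuffixFormat()
    {
    }

    // "test-1.png" -> "png"; rejects names without a usable suffix
    static String formatFromFilename(String filename)
    {
        int dot = filename.lastIndexOf('.');
        if (dot < 0 || dot == filename.length() - 1)
        {
            throw new IllegalArgumentException("no suffix in " + filename);
        }
        return filename.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Encodes the image in memory using the format named by the suffix.
    static byte[] encode(BufferedImage image, String filename)
    {
        String format = formatFromFilename(filename);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try
        {
            // global setting: avoid temporary cache files, keep everything in memory
            ImageIO.setUseCache(false);
            if (!ImageIO.write(image, format, out))
            {
                throw new IllegalArgumentException("no image writer for format " + format);
            }
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With this approach a typo in the suffix (say ".pgn") is reported as a missing writer instead of silently producing a file in the wrong format.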



Kind regards
Daniel



On 2015-12-29 at 02:02, Nakul Malhotra wrote:
> How do you use pdfbox to extract?!!?


Re: How do I create a valid PDF/A document with PDFBOX 2.0?

Posted by Tilman Hausherr <TH...@t-online.de>.
On 29.12.2015 at 02:02, Nakul Malhotra wrote:
> How do you use pdfbox to extract?!!?

https://pdfbox.apache.org/1.8/cookbook/textextraction.html

PDFTextStripper stripper = new PDFTextStripper();
stripper.setStartPage(2);
stripper.setEndPage(3);
stripper.getText(document);


Re: How do I create a valid PDF/A document with PDFBOX 2.0?

Posted by Nakul Malhotra <ni...@gmail.com>.
How do you use pdfbox to extract?!!?


Re: How do I create a valid PDF/A document with PDFBOX 2.0?

Posted by db...@bergqvist.se.
Thanks!





Re: How do I create a valid PDF/A document with PDFBOX 2.0?

Posted by Tilman Hausherr <TH...@t-online.de>.
On 28.12.2015 at 04:40, db123@bergqvist.se wrote:
> Hi,
>
> I have read this document about creating a valid PDF/A document:
>
> https://pdfbox.apache.org/1.8/cookbook/pdfacreation.html
>
> The problem is that it is written for PDFBOX 1.8 and I cannot get it 
> to work for PDFBOX 2.0. I have searched the example folder but haven't 
> been able to find an example or any other documentation on how to 
> create a valid PDF/A document.

It is in 
examples\src\main\java\org\apache\pdfbox\examples\pdmodel\CreatePDFA.java in 
the ZIP file:

/*
  * Licensed to the Apache Software Foundation (ASF) under one or more
  * contributor license agreements.  See the NOTICE file distributed with
  * this work for additional information regarding copyright ownership.
  * The ASF licenses this file to You under the Apache License, Version 2.0
  * (the "License"); you may not use this file except in compliance with
  * the License.  You may obtain a copy of the License at
  *
  *      http://www.apache.org/licenses/LICENSE-2.0
  *
  * Unless required by applicable law or agreed to in writing, software
  * distributed under the License is distributed on an "AS IS" BASIS,
  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
package org.apache.pdfbox.examples.pdmodel;

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.transform.TransformerException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDMetadata;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType0Font;
import org.apache.pdfbox.pdmodel.graphics.color.PDOutputIntent;
import org.apache.xmpbox.XMPMetadata;
import org.apache.xmpbox.schema.DublinCoreSchema;
import org.apache.xmpbox.schema.PDFAIdentificationSchema;
import org.apache.xmpbox.type.BadFieldValueException;
import org.apache.xmpbox.xml.XmpSerializer;

/**
  * Creates a simple PDF/A document.
  */
public final class CreatePDFA
{
     private CreatePDFA()
     {
     }

     public static void main(String[] args) throws IOException, TransformerException
     {
         if (args.length != 3)
         {
             System.err.println("usage: " + CreatePDFA.class.getName() +
                     " <output-file> <Message> <ttf-file>");
             System.exit(1);
         }

         String file = args[0];
         String message = args[1];
         String fontfile = args[2];

         PDDocument doc = new PDDocument();
         try
         {
             PDPage page = new PDPage();
             doc.addPage(page);

             // load the font as this needs to be embedded
             PDFont font = PDType0Font.load(doc, new File(fontfile));

             // create a page with the message
             PDPageContentStream contents = new PDPageContentStream(doc, page);
             contents.beginText();
             contents.setFont(font, 12);
             contents.newLineAtOffset(100, 700);
             contents.showText(message);
             contents.endText();
             contents.saveGraphicsState();
             contents.close();

             // add XMP metadata
             XMPMetadata xmp = XMPMetadata.createXMPMetadata();

             try
             {
                 DublinCoreSchema dc = xmp.createAndAddDublinCoreSchema();
                 dc.setTitle(file);

                 PDFAIdentificationSchema id = xmp.createAndAddPFAIdentificationSchema();
                 id.setPart(1);
                 id.setConformance("B");

                 XmpSerializer serializer = new XmpSerializer();
                 ByteArrayOutputStream baos = new ByteArrayOutputStream();
                 serializer.serialize(xmp, baos, true);

                 PDMetadata metadata = new PDMetadata(doc);
                 metadata.importXMPMetadata(baos.toByteArray());
                 doc.getDocumentCatalog().setMetadata(metadata);
             }
             catch(BadFieldValueException e)
             {
                 // won't happen here, as the provided value is valid
                 throw new IllegalArgumentException(e);
             }

             // sRGB output intent
             InputStream colorProfile = CreatePDFA.class.getResourceAsStream(
                     "/org/apache/pdfbox/resources/pdfa/sRGB Color Space Profile.icm");
             PDOutputIntent intent = new PDOutputIntent(doc, colorProfile);
             intent.setInfo("sRGB IEC61966-2.1");
             intent.setOutputCondition("sRGB IEC61966-2.1");
             intent.setOutputConditionIdentifier("sRGB IEC61966-2.1");
             intent.setRegistryName("http://www.color.org");
             doc.getDocumentCatalog().addOutputIntent(intent);

             doc.save(file);
         }
         finally
         {
             doc.close();
         }
     }
}
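
One thing to watch in the output intent step of the listing: getResourceAsStream returns null when the named resource is not on the classpath, and the failure then surfaces later and less clearly. A minimal sketch of a guard, using only the JDK, could look like the following. ClasspathResources is a hypothetical helper name and is not part of PDFBox or of the example above.

```java
import java.io.InputStream;

// Hypothetical helper: fail early, with the path in the message,
// when a classpath resource such as the ICC profile is missing.
final class ClasspathResources
{
    private ClasspathResources()
    {
    }

    static InputStream open(Class<?> anchor, String path)
    {
        InputStream in = anchor.getResourceAsStream(path);
        if (in == null)
        {
            throw new IllegalStateException("Classpath resource not found: " + path);
        }
        return in;
    }
}
```

In the listing, the colorProfile assignment would then read ClasspathResources.open(CreatePDFA.class, "/org/apache/pdfbox/resources/pdfa/sRGB Color Space Profile.icm"), assuming the profile is still shipped under that path in the PDFBox version you use.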

