Posted to users@pdfbox.apache.org by Tilman Hausherr <TH...@t-online.de> on 2023/12/16 10:02:51 UTC

Re: PDF to PDF/A conversion on java

On 21.11.2023 11:31, Kirandas vakkil wrote:
> Hi All,
>
> Can you please share if there is any resource on converting EXISTING PDF to
> PDF/A in java.

There are commercial tools for this. PDFBox has no built-in converter;
however, you can still do it yourself if the files have very few errors,
you know how to fix them, and all files come from the same source. This
is usually true for files from scanners, where you usually only have to
add an output intent and the correct metadata.

Tilman



>
> This will be of great help to me. Thanks in advance.
>
> Regards,
> Apache Patron
>


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: PDF to PDF/A conversion on java

Posted by Tilman Hausherr <TH...@t-online.de>.
On 19.12.2023 00:24, CowwoC wrote:
> I'm going to need to do something like this in the near future. Are there
> any good samples or documentation I can look at for this use case?




import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.transform.TransformerException;
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;
import org.apache.pdfbox.pdmodel.common.PDMetadata;
import org.apache.pdfbox.pdmodel.graphics.color.PDOutputIntent;
import org.apache.xmpbox.XMPMetadata;
import org.apache.xmpbox.schema.DublinCoreSchema;
import org.apache.xmpbox.schema.PDFAIdentificationSchema;
import org.apache.xmpbox.schema.XMPBasicSchema;
import org.apache.xmpbox.type.BadFieldValueException;
import org.apache.xmpbox.xml.XmpSerializer;


public final class ConvertToPDFA
{

     private ConvertToPDFA()
     {
     }

     public static void main(String[] args) throws IOException, TransformerException
     {
         String file = "XXXX\\testme1.pdf";
         String file2 = "XXXX\\testme1-pdfa.pdf";

         try (PDDocument doc = Loader.loadPDF(new File(file)))
         {
             doc.setVersion(1.4f);
             // add XMP metadata
             XMPMetadata xmp = XMPMetadata.createXMPMetadata();

             try
             {
                 DublinCoreSchema dc = xmp.createAndAddDublinCoreSchema();
                 dc.setTitle(file);

                 PDFAIdentificationSchema id = xmp.createAndAddPDFAIdentificationSchema();
                 id.setPart(1);
                 id.setConformance("B");

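                 // PDF/A requires the info dictionary and the XMP metadata to agree,
                 // so Creator and xmp:CreatorTool get the same value
                 PDDocumentInformation info = new PDDocumentInformation();
                 info.setCreator("PDFBox");
                 XMPBasicSchema basicSchema = xmp.createAndAddXMPBasicSchema();
                 basicSchema.setCreatorTool("PDFBox");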

                 XmpSerializer serializer = new XmpSerializer();
                 ByteArrayOutputStream baos = new ByteArrayOutputStream();
                 serializer.serialize(xmp, baos, true);

                 PDMetadata metadata = new PDMetadata(doc);
                 metadata.importXMPMetadata(baos.toByteArray());
                 doc.getDocumentCatalog().setMetadata(metadata);
                 doc.setDocumentInformation(info);
             }
             catch (BadFieldValueException e)
             {
                 // won't happen here, as the provided value is valid
                 throw new IllegalArgumentException(e);
             }

             // sRGB output intent; the profile stream is closed once it has been read
             try (InputStream colorProfile = ConvertToPDFA.class.getResourceAsStream(
                     "/org/apache/pdfbox/resources/pdfa/sRGB.icc"))
             {
                 PDOutputIntent intent = new PDOutputIntent(doc, colorProfile);
                 intent.setInfo("sRGB IEC61966-2.1");
                 intent.setOutputCondition("sRGB IEC61966-2.1");
                 intent.setOutputConditionIdentifier("sRGB IEC61966-2.1");
                 intent.setRegistryName("http://www.color.org");
                 doc.getDocumentCatalog().addOutputIntent(intent);
             }

             doc.save(file2);
         }
     }
}
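
The code above reads sRGB.icc from a classpath resource, so
getResourceAsStream returns null if that resource is not on the classpath.
One possible alternative, which is an assumption and not part of the
original message, is to take the sRGB profile that ships with the JDK and
pass it to the PDOutputIntent constructor as a stream. The class and method
names below (SrgbProfileSource, srgbProfileBytes, srgbProfileStream) are
hypothetical, and whether this profile satisfies a given PDF/A validator
should be checked separately. A minimal sketch using only the JDK:

```java
import java.awt.color.ColorSpace;
import java.awt.color.ICC_Profile;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox: supplies the JDK's sRGB ICC profile.
public final class SrgbProfileSource
{
    private SrgbProfileSource()
    {
    }

    /** Returns the raw bytes of the JDK's built-in sRGB ICC profile. */
    public static byte[] srgbProfileBytes()
    {
        return ICC_Profile.getInstance(ColorSpace.CS_sRGB).getData();
    }

    /** Wraps the profile bytes in an InputStream for APIs that expect a stream. */
    public static InputStream srgbProfileStream()
    {
        return new ByteArrayInputStream(srgbProfileBytes());
    }

    public static void main(String[] args)
    {
        byte[] data = srgbProfileBytes();
        // Bytes 36..39 of an ICC profile header hold the file signature "acsp".
        String signature = new String(data, 36, 4, StandardCharsets.US_ASCII);
        System.out.println(data.length + " bytes, signature " + signature);
    }
}
```

In the conversion code, srgbProfileStream() would take the place of the
getResourceAsStream call; the rest of the output intent setup stays the same.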

Re: PDF to PDF/A conversion on java

Posted by CowwoC <co...@gmail.com>.
I'm going to need to do something like this in the near future. Are there
any good samples or documentation I can look at for this use case?

Thanks,
Gili
