Posted to users@pdfbox.apache.org by Guillaume Bailleul <gb...@gmail.com> on 2013/12/19 21:06:38 UTC

Re: To PDF/A converted JPG has corrupt XMP-Metadata

Hi,

Could you please create and send me a main class that reproduces the
problem? I will have a look at it.

KR,

Guillaume

On Thu, Nov 21, 2013 at 11:36 AM, Kaiser, Florian (Extern)
<Fl...@ofdka.bwl.de> wrote:
> Hi,
>
> Using the examples at
> http://svn.apache.org/repos/asf/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdfa/CreatePDFA.java
> and
> http://java2s.com/Open-Source/Java/PDF/PDFBox-1.4.0/org/apache/pdfbox/examples/pdmodel/ImageToPDF.java.htm
> I wrote my own program that converts JPG to PDF/A-1b with PDFBox
> 2.0.0-SNAPSHOT.
>
> The conversion itself runs without errors, but according to both the
> 3-Heights PDF Validator Online Tool and the PDF/A validation of PDFBox,
> the resulting PDF is not a valid PDF/A-1b.
> 3-Heights reports:
>
>         Validating file "PdfBox-pic23392.pdf" for conformance level pdfa-1b
>         XML line 1:1: Start tag expected, '<' not found.
>         The document does not conform to the requested standard.
>         The document's meta data is either missing or inconsistent or corrupt.
>
> and the PDFBox validation reports:
>
>         java.lang.AssertionError: File is not a valid PDF/A-1b:
>         Error on MetaData, Failed to parse [7.1]
>
> The source code of my program is
>
>         import java.io.IOException;
>         import java.io.InputStream;
>         import java.io.OutputStream;
>         import javax.xml.transform.TransformerException;
>         import org.apache.jempbox.xmp.XMPMetadata;
>         import org.apache.jempbox.xmp.pdfa.XMPSchemaPDFAId;
>         import org.apache.pdfbox.pdmodel.PDDocument;
>         import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
>         import org.apache.pdfbox.pdmodel.PDPage;
>         import org.apache.pdfbox.pdmodel.common.PDMetadata;
>         import org.apache.pdfbox.pdmodel.edit.PDPageContentStream;
>         import org.apache.pdfbox.pdmodel.graphics.color.PDOutputIntent;
>         import org.apache.pdfbox.pdmodel.graphics.xobject.PDJpeg;
>         import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage;
>
>         public class PdfBoxJpgConverter {
>
>             public void convert(final InputStream in, final OutputStream out) throws Exception {
>
>                 PDDocument doc = null;
>                 try {
>                     doc = new PDDocument();
>
>                     PDPage page = new PDPage(PDPage.PAGE_SIZE_A4);
>
>                     doc.addPage(page);
>
>                     PDXObjectImage ximage = new PDJpeg(doc, in);
>
>                     float scaleFactor = getFitToPageScaleFactor(page, ximage);
>
>                     PDPageContentStream contentStream = new PDPageContentStream(doc, page);
>
>                     contentStream.drawXObject(ximage, 0, 0, scaleFactor * ximage.getWidth(), scaleFactor * ximage.getHeight());
>
>                     writeXmpMetadata(doc);
>                     writeOutputIntent(doc);
>
>                     contentStream.close();
>                     doc.save(out);
>                 } finally {
>                     if (doc != null) {
>                         doc.close();
>                     }
>                 }
>
>             }
>
>             private float getFitToPageScaleFactor(final PDPage page, final PDXObjectImage image) {
>                 float pageWidth = page.getMediaBox().getWidth();
>                 float pageHeight = page.getMediaBox().getHeight();
>                 float imageWidth = image.getWidth();
>                 float imageHeight = image.getHeight();
>                 return Math.min(pageHeight / imageHeight, pageWidth / imageWidth);
>             }
>
>             private void writeXmpMetadata(final PDDocument document) throws IOException, TransformerException {
>                 PDDocumentCatalog catalog = document.getDocumentCatalog();
>                 PDMetadata metadata = new PDMetadata(document);
>                 XMPMetadata xmp = new XMPMetadata();
>
>                 XMPSchemaPDFAId pdfaid = new XMPSchemaPDFAId(xmp);
>                 pdfaid.setConformance("B");
>                 pdfaid.setPart(1);
>                 pdfaid.setAbout("");
>
>                 xmp.addSchema(pdfaid);
>                 metadata.importXMPMetadata(xmp);
>                 catalog.setMetadata(metadata);
>             }
>
>             private void writeOutputIntent(final PDDocument doc) throws Exception {
>                 PDDocumentCatalog cat = doc.getDocumentCatalog();
>
>                 InputStream colorProfile = PdfBoxJpgConverter.class.getResourceAsStream("/sRGB_IEC61966-2-1_black_scaled.icc");
>
>                 PDOutputIntent oi = new PDOutputIntent(doc, colorProfile);
>                 oi.setInfo("sRGB IEC61966-2.1");
>                 oi.setOutputCondition("sRGB IEC61966-2.1");
>                 oi.setOutputConditionIdentifier("sRGB IEC61966-2.1");
>                 oi.setRegistryName("http://www.color.org");
>
>                 cat.addOutputIntent(oi);
>             }
>
>         }
>
> Am I doing something wrong or is this a bug?
>
> Thanks for the help.
>
> Florian
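
The 3-Heights message "Start tag expected, '<' not found" at line 1:1
suggests that the bytes stored in the metadata stream do not begin with
XML at all. One way to narrow this down is to dump the metadata stream
and look at its first character. The following is only a rough sketch
and not code from this thread: the class XmpPacketCheck and its method
startsWithTag are hypothetical names, and it assumes the metadata bytes
have already been read into a byte array by some other means.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox or JempBox.
public class XmpPacketCheck {

    /**
     * Returns true if the first byte that is not whitespace (after an
     * optional UTF-8 byte order mark) is '<'. This is a coarse sanity
     * check only; it does not validate the XML or the XMP content.
     */
    public static boolean startsWithTag(byte[] data) {
        int i = 0;
        // Skip a UTF-8 byte order mark (EF BB BF) if present.
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            i = 3;
        }
        // Skip leading ASCII whitespace.
        while (i < data.length
                && (data[i] == ' ' || data[i] == '\t'
                    || data[i] == '\r' || data[i] == '\n')) {
            i++;
        }
        return i < data.length && data[i] == '<';
    }

    public static void main(String[] args) {
        // Illustrative inputs only; real data would come from the PDF.
        byte[] looksLikeXml = "<?xpacket begin=\"\" id=\"x\"?><x:xmpmeta/>"
                .getBytes(StandardCharsets.UTF_8);
        byte[] notXml = "this is not an XMP packet"
                .getBytes(StandardCharsets.UTF_8);

        System.out.println(startsWithTag(looksLikeXml)); // prints "true"
        System.out.println(startsWithTag(notXml));       // prints "false"
    }
}
```

If such a check fails on the bytes taken from the saved PDF, the problem
is likely in how the XMP packet is serialized into the stream rather
than in the pdfaid schema values themselves.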