Posted to users@pdfbox.apache.org by Esteban R <er...@hotmail.com> on 2017/07/27 15:49:20 UTC

XMP issue when using fop and shaded jar

I'm working on a project which uses both pdfbox and apache fop (https://xmlgraphics.apache.org/fop/)


CreatePDFA (from the pdfbox examples) generates different (wrong?) XMP metadata when fop is included in the shaded jar. Otherwise (i.e. if a simple jar is generated or if fop is not included), the metadata is ok. The issue seems to be related to the services provided by fop (more details below). How can I solve it? Is it ok to exclude the services provided by fop from the shaded jar?


You can find a sample project here:

<http://www.filedropper.com/tmpxmpissue>http://www.filedropper.com/tmpxmpissue_1


To reproduce, you can use try.sh (included in the project) in a Cygwin (Windows 10) environment (it should also work on Linux), or you can run:

mvn clean install

java -jar target/tmp_xmp_issue-1.0-SNAPSHOT.jar  out.pdf "Hello world" OpenSans-Regular.ttf
Open out.pdf with PDFDebugger and see the XMPMetadata:
<rdf:li lang="x-default">out.pdf</rdf:li>

If you don't include fop (i.e. use pom.xml.nofop) then you will get:
<rdf:li xml:lang="x-default">out.pdf</rdf:li>

The same will happen if you use pom.xml.noservices (which just excludes the services provided by fop).
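
The difference between the two outputs can also be checked without opening PDFDebugger. This is only a minimal sketch, assuming the XMP fragment is available as a string; the class XmlLangCheck and its method are made up for illustration and are not part of pdfbox. A namespace-aware parser treats a plain "lang" attribute as having no namespace, while "xml:lang" belongs to the XML namespace:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper (not part of pdfbox): reports whether the root element
// of an XML fragment carries a namespace-qualified xml:lang attribute.
final class XmlLangCheck {

    static boolean hasXmlLang(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // Without this, the parser would not distinguish lang from xml:lang by namespace.
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // XML_NS_URI is http://www.w3.org/XML/1998/namespace
            return doc.getDocumentElement().hasAttributeNS(XMLConstants.XML_NS_URI, "lang");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse fragment", e);
        }
    }

    public static void main(String[] args) {
        String rdf = "xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"";
        // Output seen without fop in the shaded jar
        System.out.println(hasXmlLang("<rdf:li " + rdf + " xml:lang=\"x-default\">out.pdf</rdf:li>")); // true
        // Output seen with fop in the shaded jar
        System.out.println(hasXmlLang("<rdf:li " + rdf + " lang=\"x-default\">out.pdf</rdf:li>"));     // false
    }
}
```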


I use java version "1.8.0_112" and pdfbox 2.0.7.

Re: XMP issue when using fop and shaded jar

Posted by Esteban R <er...@hotmail.com>.
Thanks Andreas. It doesn't seem to solve the issue, but it may help. I'm a newbie in the Maven world, so sorry if I misunderstood something.


I have been trying to exclude the pdfbox dependency from fop (with the <exclusions> tag):

      <exclusions>
        <exclusion>
          <groupId>org.apache.pdfbox</groupId>
          <artifactId>fontbox</artifactId>
        </exclusion>
      </exclusions>

but it doesn't help.


In fact, "mvn dependency:list" doesn't show pdfbox 2.0.4 (although it is listed in https://mvnrepository.com/artifact/org.apache.xmlgraphics/fop/2.2). Maybe I'm missing something? This is the output of mvn dependency:list:

[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building tmp_xmp_issue 1.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-dependency-plugin:2.1:list (default-cli) @ tmp_xmp_issue ---
[INFO]
[INFO] The following files have been resolved:
[INFO]    commons-io:commons-io:jar:1.3.1:compile
[INFO]    commons-logging:commons-logging:jar:1.2:compile
[INFO]    junit:junit:jar:3.8.1:test
[INFO]    org.apache.avalon.framework:avalon-framework-api:jar:4.3.1:compile
[INFO]    org.apache.avalon.framework:avalon-framework-impl:jar:4.3.1:compile
[INFO]    org.apache.pdfbox:fontbox:jar:2.0.7:compile
[INFO]    org.apache.pdfbox:pdfbox:jar:2.0.7:compile
[INFO]    org.apache.pdfbox:pdfbox-debugger:jar:2.0.7:compile
[INFO]    org.apache.pdfbox:pdfbox-tools:jar:2.0.7:compile
[INFO]    org.apache.pdfbox:xmpbox:jar:2.0.7:compile
[INFO]    org.apache.xmlgraphics:batik-anim:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-awt-util:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-bridge:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-constants:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-css:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-dom:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-ext:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-extension:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-gvt:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-i18n:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-parser:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-script:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-svg-dom:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-svggen:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-transcoder:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-util:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:batik-xml:jar:1.9:compile
[INFO]    org.apache.xmlgraphics:fop:jar:2.2:compile
[INFO]    org.apache.xmlgraphics:xmlgraphics-commons:jar:2.2:compile
[INFO]    xalan:serializer:jar:2.7.2:compile
[INFO]    xalan:xalan:jar:2.7.2:compile
[INFO]    xml-apis:xml-apis:jar:1.3.04:compile
[INFO]    xml-apis:xml-apis-ext:jar:1.3.04:compile
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.042s
[INFO] Finished at: Thu Jul 27 15:00:34 ART 2017
[INFO] Final Memory: 9M/155M
[INFO] ------------------------------------------------------------------------


Besides, it doesn't explain (at least to me) why just excluding the "services" from META-INF solves the issue.

Esteban


________________________________
From: Andreas Lehmkuehler <an...@lehmi.de>
Sent: Thursday, July 27, 2017 05:04 p.m.
To: users@pdfbox.apache.org
Subject: Re: XMP issue when using fop and shaded jar

fop uses PDFBox 2.0.4, so there are two competing versions to be put into the
jar, and obviously the older one "wins".

Andreas

Re: XMP issue when using fop and shaded jar

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
fop uses PDFBox 2.0.4, so there are two competing versions to be put into the
jar, and obviously the older one "wins".

Andreas



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: XMP issue when using fop and shaded jar

Posted by Esteban R <er...@hotmail.com>.
Thanks Tilman!


I have used the modified XMPSerializer according to your advice and that solved the issue.


Esteban Ruiz

Re: XMP issue when using fop and shaded jar

Posted by Tilman Hausherr <TH...@t-online.de>.

In Serializer, the attribute and its namespace are set with

esimple.setAttributeNS(attribute.getNamespace(), attribute.getName(), attribute.getValue());

I traced it; the parameters are

http://www.w3.org/XML/1998/namespace
lang
x-default

so all is good there. The transformer is the problem.

This code in Serializer.java:

Transformer transformer = TransformerFactory.newInstance().newTransformer();

returns a com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl
instance. If fop is there, it returns an
org.apache.xalan.transformer.TransformerIdentityImpl instance instead.

javadoc:
https://docs.oracle.com/javase/7/docs/api/javax/xml/transform/TransformerFactory.html#newInstance()
"The Services API will look for a classname in the file 
META-INF/services/javax.xml.transform.TransformerFactory in jars 
available to the runtime."
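
To see which factory the Services API actually picks in a given jar, something like the following can be run from inside that jar. It is only a sketch: the class TransformerFactoryProbe and its method names are hypothetical, and it uses nothing beyond standard JAXP and ClassLoader calls.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic class, for illustration only.
final class TransformerFactoryProbe {

    static final String PROPERTY = "javax.xml.transform.TransformerFactory";
    static final String SERVICE_FILE = "META-INF/services/" + PROPERTY;

    // Name of the factory class that TransformerFactory.newInstance() resolves to.
    static String resolvedFactoryClass() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    // All service files for TransformerFactory visible to the given class loader.
    static List<URL> serviceFiles(ClassLoader loader) {
        try {
            return Collections.list(loader.getResources(SERVICE_FILE));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The system property, if set, takes precedence over service files.
        System.out.println("system property: " + System.getProperty(PROPERTY));
        for (URL url : serviceFiles(ClassLoader.getSystemClassLoader())) {
            System.out.println("service file:    " + url);
        }
        System.out.println("resolved:        " + resolvedFactoryClass());
    }
}
```

In a shaded jar that keeps the xalan service file, the "service file" line should point into that jar. With the services excluded, no such line is printed and the JDK default is used.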

fop uses xalan-2.7.2.jar, which has a
META-INF/services/javax.xml.transform.TransformerFactory file, and that
file contains org.apache.xalan.processor.TransformerFactoryImpl.

So the question is now, who is right? Is it a bug in xalan or in the 
calling code, i.e. do we have to set some option?

We could force the default implementation by changing the code in "save" to

Transformer transformer = TransformerFactory.newInstance(
        "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl", null).newTransformer();

Or you can set a system property:

System.setProperty("javax.xml.transform.TransformerFactory", 
"com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl");

However maybe this will mess up something in FOP :-(

So I guess the best option for you would be to copy the source code of
Serializer and change the newInstance call as described. Make sure
this works for all Java versions.
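
A rough sketch of what such a pinned factory could look like outside of Serializer. The class PinnedTransformer and its methods are made up for illustration. The internal class name is an implementation detail of Oracle/OpenJDK runtimes, so the sketch falls back to the normal lookup if it cannot be loaded; that fallback is my assumption, not something tested against every JVM.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, for illustration only.
final class PinnedTransformer {

    // JDK built-in factory; an internal name that may differ on other JVMs.
    static final String JDK_FACTORY =
            "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl";

    static TransformerFactory newFactory() {
        try {
            // Bypasses META-INF/services and the system property.
            return TransformerFactory.newInstance(JDK_FACTORY, null);
        } catch (TransformerFactoryConfigurationError e) {
            // Assumed fallback: use whatever the standard lookup finds.
            return TransformerFactory.newInstance();
        }
    }

    // Runs an identity transform over the given XML and returns the result.
    static String identity(String xml) {
        try {
            Transformer transformer = newFactory().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(newFactory().getClass().getName());
        System.out.println(identity(
                "<rdf:li xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
                + " xml:lang=\"x-default\">out.pdf</rdf:li>"));
    }
}
```

Unlike the system property, this only affects code that calls newFactory(), so FOP keeps using whatever factory it finds on its own.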

Tilman

