Posted to users@pdfbox.apache.org by Roberto Nibali <rn...@gmail.com> on 2016/01/27 15:08:35 UTC
NPE through NFE in preflight due to "questionable" header entry
Hi
The PDF uploaded to http://www.filedropper.com/6biasebm seems to have a
broken header section. It opens fine in Preview, Adobe and others;
preflight, however, repeatedly trips over it with an NFE:
log4j: Adding appender named [console] to category [root].
2016-01-27 14:52:38 DEBUG COSParser:1879 - Can't parse the header version.
java.lang.NumberFormatException: For input string: "1.\"
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
    at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
    at java.lang.Float.parseFloat(Float.java:451)
    at org.apache.pdfbox.pdfparser.COSParser.parseHeader(COSParser.java:1874)
    at org.apache.pdfbox.pdfparser.COSParser.parsePDFHeader(COSParser.java:1801)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:242)
    at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:208)
    at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:190)
    at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:178)
    at org.apache.pdfbox.preflight.parser.XmlResultParser.validate(XmlResultParser.java:70)
    at org.apache.pdfbox.preflight.parser.XmlResultParser.validate(XmlResultParser.java:51)
    at org.apache.pdfbox.preflight.Validator_A1b.main(Validator_A1b.java:121)
2016-01-27 14:52:38 DEBUG COSParser:1879 - Can't parse the header version.
java.lang.NumberFormatException: For input string: "1.\"
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
    at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
    at java.lang.Float.parseFloat(Float.java:451)
    at org.apache.pdfbox.pdfparser.COSParser.parseHeader(COSParser.java:1874)
    at org.apache.pdfbox.pdfparser.COSParser.parsePDFHeader(COSParser.java:1801)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:242)
    at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:208)
    at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:190)
    at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:178)
    at org.apache.pdfbox.preflight.parser.XmlResultParser.validate(XmlResultParser.java:70)
    at org.apache.pdfbox.preflight.parser.XmlResultParser.validate(XmlResultParser.java:51)
    at org.apache.pdfbox.preflight.Validator_A1b.main(Validator_A1b.java:121)
2016-01-27 14:52:38 DEBUG COSParser:1879 - Can't parse the header version.
java.lang.NumberFormatException: For input string: "1.\"
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
    at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
    at java.lang.Float.parseFloat(Float.java:451)
    at org.apache.pdfbox.pdfparser.COSParser.parseHeader(COSParser.java:1874)
    at org.apache.pdfbox.pdfparser.COSParser.parsePDFHeader(COSParser.java:1801)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:242)
    at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:208)
    at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:190)
    at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:178)
    at org.apache.pdfbox.preflight.parser.XmlResultParser.validate(XmlResultParser.java:70)
    at org.apache.pdfbox.preflight.parser.XmlResultParser.validate(XmlResultParser.java:51)
    at org.apache.pdfbox.preflight.Validator_A1b.main(Validator_A1b.java:121)
[...]
Regardless of the header problem itself, I believe we should not throw the
same NFE repeatedly here.
Further on, the log fills up with an abundance of the following:
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
[...]
Code used: today's SVN head. Should the header parsing be relaxed to match
Adobe's interpretation?
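To make the question concrete, here is a minimal sketch of what "relaxed" header parsing could look like. It is not PDFBox code: the class LenientHeaderSketch, the method lenientVersion and the fallback parameter are hypothetical names chosen for illustration, and whether a validator should tolerate such headers at all is exactly the open question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the actual PDFBox parser.
final class LenientHeaderSketch {
    // Looks for "%PDF-<digit>.<digit>" anywhere in the line and ignores trailing junk.
    private static final Pattern VERSION = Pattern.compile("%PDF-(\\d\\.\\d)");

    // Returns the version found in the header line, or the caller-supplied
    // fallback (an assumption of this sketch) when no complete version is present.
    static String lenientVersion(String headerLine, String fallback) {
        Matcher m = VERSION.matcher(headerLine);
        return m.find() ? m.group(1) : fallback;
    }

    public static void main(String[] args) {
        // A header with trailing garbage still yields its version.
        System.out.println(lenientVersion("%PDF-1.6xyz", "1.4"));
        // A header cut off after "1." has no minor digit, so the fallback is used.
        System.out.println(lenientVersion("%PDF-1.\\", "1.4"));
    }
}
```

A strict validator would probably still want to report the malformed header as an error even if parsing continues with the fallback.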
Adobe's Preflight points me to the following traversal path:
[image: Inline image 1]
PDFBox's preflight returns the following XML report:
[image: Inline image 2]
Cheers
Roberto
Re: NPE through NFE in preflight due to "questionable" header entry
Posted by Tilman Hausherr <TH...@t-online.de>.
On 28.01.2016 at 01:03, Roberto Nibali wrote:
> I decided to run a quick regression test using the following suites (317
> PDF files in total):
>
> - The cabinet of horrors PDF suite:
> http://opf-labs.org/format-corpus/pdfCabinetOfHorrors/ (get the PDFs: 'wget
> -A pdf -m -p -E -k -K -np
> http://opf-labs.org/format-corpus/pdfCabinetOfHorrors/')
> - The Bavaria report test suite:
> http://www.pdflib.com/fileadmin/pdflib/Bavaria/2009-04-03-Bavaria-pdfa.zip
> - The Isartor report test suite:
> http://www.pdfa.org/wp-content/uploads/2011/08/isartor-pdfa-2008-08-13.zip
>
> Basically, no NPE was triggered anymore from what I can see. The Isartor
> test suite is a bit crazy, so preflight spat some 'java.io.IOException:
> head is mandatory' exceptions; none of them affected the final reports.
>
> Other than that, preflight looks pretty ok.
Thanks, but the Bavaria and Isartor test suites are already run on the
build server :-) The Bavaria test suite is disabled for ordinary users
so that they don't get scared by a build process that takes 5 minutes;
it can be enabled with
-Dskip-bavaria=false
There's currently no way to check the cabinet of horrors, because the
tests expect a zip file.
Tilman
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org
Re: NPE through NFE in preflight due to "questionable" header entry
Posted by Roberto Nibali <rn...@gmail.com>.
Addendum:
On Wed, Jan 27, 2016 at 11:53 PM, Roberto Nibali <rn...@gmail.com> wrote:
> Hi Tilman
>
>
>
> On Wed, Jan 27, 2016 at 11:51 PM, Tilman Hausherr <TH...@t-online.de>
> wrote:
>
>> On 27.01.2016 at 18:48, Tilman Hausherr wrote:
>>
>>> On 27.01.2016 at 15:08, Roberto Nibali wrote:
>>>
>>>>
>>>>
>>>> The PDF uploaded to http://www.filedropper.com/6biasebm seems to have
>>>> a broken header section. It opens well with Preview, Adobe and others,
>>>> however preflight repeatedly trips pretty badly with a NFE:
>>>>
>>>> log4j: Adding appender named [console] to category [root].
>>>> 2016-01-27 14:52:38 DEBUG COSParser:1879 - Can't parse the header
>>>> version.
>>>> java.lang.NumberFormatException: For input string: "1.\"
>>>> at
>>>> sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
>>>>
>>>
>>> That's only in debugging that this is shown. The NFE is rethrown as an
>>> IOException and this should result in a normal preflight error.
>>>
>>> However I don't get this with your file. Instead I get this, which is
>>> even worse:
>>>
>>> Exception in thread "main" java.lang.NullPointerException
>>> at
>>> org.apache.xmpbox.xml.PdfaExtensionHelper.populatePDFAPropertyType(PdfaExtensionHelper.java:180)
>>>
>>
>> This has been fixed, output is now several errors including
>>
>> 7.6 : Error on MetaData, Unknown property value type : Seq Text
>>
>>
> Yep, it's fixed with your last commit. Just verified on my side.
>
>
I decided to run a quick regression test using the following suites (317
PDF files in total):
- The cabinet of horrors PDF suite:
http://opf-labs.org/format-corpus/pdfCabinetOfHorrors/ (get the PDFs: 'wget
-A pdf -m -p -E -k -K -np
http://opf-labs.org/format-corpus/pdfCabinetOfHorrors/')
- The Bavaria report test suite:
http://www.pdflib.com/fileadmin/pdflib/Bavaria/2009-04-03-Bavaria-pdfa.zip
- The Isartor report test suite:
http://www.pdfa.org/wp-content/uploads/2011/08/isartor-pdfa-2008-08-13.zip
Basically, no NPE was triggered anymore from what I can see. The Isartor
test suite is a bit crazy, so preflight spat some 'java.io.IOException:
head is mandatory' exceptions; none of them affected the final reports.
Other than that, preflight looks pretty ok.
Cheers
Roberto
Re: NPE through NFE in preflight due to "questionable" header entry
Posted by Roberto Nibali <rn...@gmail.com>.
Hi Tilman
On Wed, Jan 27, 2016 at 11:51 PM, Tilman Hausherr <TH...@t-online.de>
wrote:
> On 27.01.2016 at 18:48, Tilman Hausherr wrote:
>
>> On 27.01.2016 at 15:08, Roberto Nibali wrote:
>>
>>>
>>>
>>> The PDF uploaded to http://www.filedropper.com/6biasebm seems to have a
>>> broken header section. It opens well with Preview, Adobe and others,
>>> however preflight repeatedly trips pretty badly with a NFE:
>>>
>>> log4j: Adding appender named [console] to category [root].
>>> 2016-01-27 14:52:38 DEBUG COSParser:1879 - Can't parse the header
>>> version.
>>> java.lang.NumberFormatException: For input string: "1.\"
>>> at
>>> sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
>>>
>>
>> That's only in debugging that this is shown. The NFE is rethrown as an
>> IOException and this should result in a normal preflight error.
>>
>> However I don't get this with your file. Instead I get this, which is
>> even worse:
>>
>> Exception in thread "main" java.lang.NullPointerException
>> at
>> org.apache.xmpbox.xml.PdfaExtensionHelper.populatePDFAPropertyType(PdfaExtensionHelper.java:180)
>>
>
> This has been fixed, output is now several errors including
>
> 7.6 : Error on MetaData, Unknown property value type : Seq Text
>
>
Yep, it's fixed with your last commit. Just verified on my side.
Thanks and best regards
Roberto
Re: NPE through NFE in preflight due to "questionable" header entry
Posted by Tilman Hausherr <TH...@t-online.de>.
On 27.01.2016 at 18:48, Tilman Hausherr wrote:
> On 27.01.2016 at 15:08, Roberto Nibali wrote:
>>
>>
>> The PDF uploaded to http://www.filedropper.com/6biasebm seems to have
>> a broken header section. It opens well with Preview, Adobe and
>> others, however preflight repeatedly trips pretty badly with a NFE:
>>
>> log4j: Adding appender named [console] to category [root].
>> 2016-01-27 14:52:38 DEBUG COSParser:1879 - Can't parse the header
>> version.
>> java.lang.NumberFormatException: For input string: "1.\"
>> at
>> sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
>
> That's only in debugging that this is shown. The NFE is rethrown as an
> IOException and this should result in a normal preflight error.
>
> However I don't get this with your file. Instead I get this, which is
> even worse:
>
> Exception in thread "main" java.lang.NullPointerException
> at
> org.apache.xmpbox.xml.PdfaExtensionHelper.populatePDFAPropertyType(PdfaExtensionHelper.java:180)
This has been fixed, output is now several errors including
7.6 : Error on MetaData, Unknown property value type : Seq Text
Re: NPE through NFE in preflight due to "questionable" header entry
Posted by Tilman Hausherr <TH...@t-online.de>.
On 27.01.2016 at 15:08, Roberto Nibali wrote:
>
>
> The PDF uploaded to http://www.filedropper.com/6biasebm seems to have
> a broken header section. It opens well with Preview, Adobe and others,
> however preflight repeatedly trips pretty badly with a NFE:
>
> log4j: Adding appender named [console] to category [root].
> 2016-01-27 14:52:38 DEBUG COSParser:1879 - Can't parse the header version.
> java.lang.NumberFormatException: For input string: "1.\"
> at
> sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
This is only shown at debug level. The NFE is rethrown as an
IOException, and this should result in a normal preflight error.
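As a rough illustration of that pattern (catch the NumberFormatException, log it at debug level only, rethrow it as an IOException), here is a self-contained sketch. It is not the actual COSParser code: the class HeaderVersionSketch, the method parseHeaderVersion and the exception messages are hypothetical, and java.util.logging stands in for the logging framework PDFBox really uses.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration only; not the actual PDFBox parser.
final class HeaderVersionSketch {
    private static final Logger LOG = Logger.getLogger(HeaderVersionSketch.class.getName());
    private static final String MARKER = "%PDF-";

    // Parses the version that follows "%PDF-" and converts any
    // NumberFormatException into an IOException for the caller.
    static float parseHeaderVersion(String headerLine) throws IOException {
        int start = headerLine.indexOf(MARKER);
        if (start < 0) {
            throw new IOException("Missing '" + MARKER + "' marker in header: " + headerLine);
        }
        String version = headerLine.substring(start + MARKER.length()).trim();
        try {
            return Float.parseFloat(version);
        } catch (NumberFormatException e) {
            // The stack trace appears only when debug (FINE) logging is enabled.
            LOG.log(Level.FINE, "Can't parse the header version.", e);
            // The caller sees an ordinary IOException with the NFE as its cause.
            throw new IOException("Error getting header version: " + headerLine, e);
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println(parseHeaderVersion("%PDF-1.4"));
            parseHeaderVersion("%PDF-1.\\"); // same malformed input as in the log
        } catch (IOException e) {
            System.out.println("IOException: " + e.getMessage());
        }
    }
}
```

With this shape, a validator can catch the IOException and turn it into a regular validation error instead of a crash.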
However I don't get this with your file. Instead I get this, which is
even worse:
Exception in thread "main" java.lang.NullPointerException
    at org.apache.xmpbox.xml.PdfaExtensionHelper.populatePDFAPropertyType(PdfaExtensionHelper.java:180)
    at org.apache.xmpbox.xml.PdfaExtensionHelper.populatePDFASchemaType(PdfaExtensionHelper.java:159)
    at org.apache.xmpbox.xml.PdfaExtensionHelper.populateSchemaMapping(PdfaExtensionHelper.java:116)
    at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:194)
    at org.apache.pdfbox.preflight.process.MetadataValidationProcess.validate(MetadataValidationProcess.java:69)
    at org.apache.pdfbox.preflight.utils.ContextHelper.callValidation(ContextHelper.java:84)
    at org.apache.pdfbox.preflight.utils.ContextHelper.validateElement(ContextHelper.java:122)
    at org.apache.pdfbox.preflight.PreflightDocument.validate(PreflightDocument.java:163)
    at org.apache.pdfbox.preflight.Validator_A1b.runSimple(Validator_A1b.java:174)
    at org.apache.pdfbox.preflight.Validator_A1b.main(Validator_A1b.java:135)