Posted to users@pdfbox.apache.org by Roberto Nibali <rn...@gmail.com> on 2016/01/27 15:08:35 UTC

NPE through NFE in preflight due to "questionable" header entry

Hi

The PDF uploaded to http://www.filedropper.com/6biasebm seems to have a
broken header section. It opens fine in Preview, Adobe and others;
however, preflight repeatedly trips pretty badly with an NFE:

log4j: Adding appender named [console] to category [root].
2016-01-27 14:52:38 DEBUG COSParser:1879 - Can't parse the header version.
java.lang.NumberFormatException: For input string: "1.\"
        at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
        at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
        at java.lang.Float.parseFloat(Float.java:451)
        at org.apache.pdfbox.pdfparser.COSParser.parseHeader(COSParser.java:1874)
        at org.apache.pdfbox.pdfparser.COSParser.parsePDFHeader(COSParser.java:1801)
        at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:242)
        at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:208)
        at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:190)
        at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:178)
        at org.apache.pdfbox.preflight.parser.XmlResultParser.validate(XmlResultParser.java:70)
        at org.apache.pdfbox.preflight.parser.XmlResultParser.validate(XmlResultParser.java:51)
        at org.apache.pdfbox.preflight.Validator_A1b.main(Validator_A1b.java:121)
[...]

Regardless of the header mismatch itself, I believe we should not throw
the same NFE over and over here.

Further into the run, the log fills up with an abundance of the following:

2016-01-27 14:52:41 DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!
[...]

Code used: SVN head of today. Should the header parsing be relaxed to match
Adobe's interpretation?
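
To make the question concrete, a relaxed version check could look roughly like the sketch below. It is only an illustration: the class LenientHeaderVersion, its parse method and the 1.4 fallback are assumptions made for this example and are not taken from the PDFBox sources.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox: shows one way to tolerate a
// damaged version token such as "1.\" instead of throwing.
final class LenientHeaderVersion {
    // Assumed fallback used when no usable version can be read.
    static final float DEFAULT_VERSION = 1.4f;

    // "%PDF-" followed by exactly one major digit, a dot and one minor digit.
    private static final Pattern VERSION = Pattern.compile("%PDF-(\\d\\.\\d)");

    private LenientHeaderVersion() {
    }

    // Returns the version found in the header line, or the fallback when the
    // line is null or carries no well-formed "major.minor" token.
    static float parse(String headerLine) {
        if (headerLine == null) {
            return DEFAULT_VERSION;
        }
        Matcher m = VERSION.matcher(headerLine);
        if (!m.find()) {
            return DEFAULT_VERSION;
        }
        // The pattern guarantees digit.digit, so parseFloat cannot fail here.
        return Float.parseFloat(m.group(1));
    }
}
```

With this sketch, a header line of %PDF-1.\ yields the fallback value rather than an exception. Whether silently falling back is acceptable for a validator is a separate question, since a malformed header is itself a conformance problem.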

Adobe's Preflight gives me pointers with regard to the following traversal
path:

[image: Inline image 1]

PDFBox's preflight returns the following XML report:

[image: Inline image 2]

Cheers

Roberto

Re: NPE through NFE in preflight due to "questionable" header entry

Posted by Tilman Hausherr <TH...@t-online.de>.
On 28.01.2016 at 01:03, Roberto Nibali wrote:
> I decided to run a quick regression test using the following suites (317
> PDF files in total):
>
>     - The cabinet of horrors PDF suite:
>     http://opf-labs.org/format-corpus/pdfCabinetOfHorrors/   (get the PDFs: 'wget
>     -A pdf -m -p -E -k -K -np
>     http://opf-labs.org/format-corpus/pdfCabinetOfHorrors/')
>     - The Bavaria report test suite:
>     http://www.pdflib.com/fileadmin/pdflib/Bavaria/2009-04-03-Bavaria-pdfa.zip
>     - The Isartor report test suite:
>     http://www.pdfa.org/wp-content/uploads/2011/08/isartor-pdfa-2008-08-13.zip
>
> Basically, no NPE was triggered anymore from what I can see. The Isartor
> test suite is a bit crazy, so preflight spat some 'java.io.IOException:
> head is mandatory' exceptions; none of them affected the final reports.
>
> Other than that, preflight looks pretty ok.

Thanks, but the Bavaria and Isartor test suites are already run on the 
build server :-) The Bavaria test suite is disabled for ordinary users 
so that they don't get scared by a build process that takes 5 minutes; 
it can be enabled with

-Dskip-bavaria=false

There's currently no way to check the cabinet of horrors, because the 
tests expect a zip file.

Tilman

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: NPE through NFE in preflight due to "questionable" header entry

Posted by Roberto Nibali <rn...@gmail.com>.
Addendum:

On Wed, Jan 27, 2016 at 11:53 PM, Roberto Nibali <rn...@gmail.com> wrote:

> Hi Tilman
>
>
>


> On Wed, Jan 27, 2016 at 11:51 PM, Tilman Hausherr <TH...@t-online.de>
> wrote:
>
>> On 27.01.2016 at 18:48, Tilman Hausherr wrote:
>>
>>> On 27.01.2016 at 15:08, Roberto Nibali wrote:
>>>
>>>>
>>>>
>>>> The PDF uploaded to http://www.filedropper.com/6biasebm seems to have
>>>> a broken header section. It opens well with Preview, Adobe and others,
>>>> however preflight repeatedly trips pretty badly with a NFE:
>>>>
>>>> log4j: Adding appender named [console] to category [root].
>>>> 2016-01-27 14:52:38 DEBUG COSParser:1879 - Can't parse the header
>>>> version.
>>>> java.lang.NumberFormatException: For input string: "1.\"
>>>>         at
>>>> sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
>>>>
>>>
>>> That is only shown at debug level. The NFE is rethrown as an
>>> IOException and this should result in a normal preflight error.
>>>
>>> However I don't get this with your file. Instead I get this, which is
>>> even worse:
>>>
>>> Exception in thread "main" java.lang.NullPointerException
>>>     at
>>> org.apache.xmpbox.xml.PdfaExtensionHelper.populatePDFAPropertyType(PdfaExtensionHelper.java:180)
>>>
>>
>> This has been fixed, output is now several errors including
>>
>> 7.6 : Error on MetaData, Unknown property value type : Seq Text
>>
>>
> Yep, it's fixed with your last commit. Just verified on my side.
>
>
I decided to run a quick regression test using the following suites (317
PDF files in total):

   - The cabinet of horrors PDF suite:
   http://opf-labs.org/format-corpus/pdfCabinetOfHorrors/ (get the PDFs with
   'wget -A pdf -m -p -E -k -K -np http://opf-labs.org/format-corpus/pdfCabinetOfHorrors/')
   - The Bavaria report test suite:
   http://www.pdflib.com/fileadmin/pdflib/Bavaria/2009-04-03-Bavaria-pdfa.zip
   - The Isartor report test suite:
   http://www.pdfa.org/wp-content/uploads/2011/08/isartor-pdfa-2008-08-13.zip

Basically, no NPE was triggered anymore from what I can see. The Isartor
test suite is a bit crazy, so preflight spat some 'java.io.IOException:
head is mandatory' exceptions; none of them affected the final reports.

Other than that, preflight looks pretty ok.

Cheers

Roberto

Re: NPE through NFE in preflight due to "questionable" header entry

Posted by Roberto Nibali <rn...@gmail.com>.
Hi Tilman

On Wed, Jan 27, 2016 at 11:51 PM, Tilman Hausherr <TH...@t-online.de>
wrote:

> On 27.01.2016 at 18:48, Tilman Hausherr wrote:
>
>> On 27.01.2016 at 15:08, Roberto Nibali wrote:
>>
>>>
>>>
>>> The PDF uploaded to http://www.filedropper.com/6biasebm seems to have a
>>> broken header section. It opens well with Preview, Adobe and others,
>>> however preflight repeatedly trips pretty badly with a NFE:
>>>
>>> log4j: Adding appender named [console] to category [root].
>>> 2016-01-27 14:52:38 DEBUG COSParser:1879 - Can't parse the header
>>> version.
>>> java.lang.NumberFormatException: For input string: "1.\"
>>>         at
>>> sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
>>>
>>
>> That is only shown at debug level. The NFE is rethrown as an
>> IOException and this should result in a normal preflight error.
>>
>> However I don't get this with your file. Instead I get this, which is
>> even worse:
>>
>> Exception in thread "main" java.lang.NullPointerException
>>     at
>> org.apache.xmpbox.xml.PdfaExtensionHelper.populatePDFAPropertyType(PdfaExtensionHelper.java:180)
>>
>
> This has been fixed, output is now several errors including
>
> 7.6 : Error on MetaData, Unknown property value type : Seq Text
>
>
Yep, it's fixed with your last commit. Just verified on my side.

Thanks and best regards

Roberto

Re: NPE through NFE in preflight due to "questionable" header entry

Posted by Tilman Hausherr <TH...@t-online.de>.
On 27.01.2016 at 18:48, Tilman Hausherr wrote:
> On 27.01.2016 at 15:08, Roberto Nibali wrote:
>>
>>
>> The PDF uploaded to http://www.filedropper.com/6biasebm seems to have 
>> a broken header section. It opens well with Preview, Adobe and 
>> others, however preflight repeatedly trips pretty badly with a NFE:
>>
>> log4j: Adding appender named [console] to category [root].
>> 2016-01-27 14:52:38 DEBUG COSParser:1879 - Can't parse the header 
>> version.
>> java.lang.NumberFormatException: For input string: "1.\"
>>         at 
>> sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
>
> That is only shown at debug level. The NFE is rethrown as an
> IOException and this should result in a normal preflight error.
>
> However I don't get this with your file. Instead I get this, which is 
> even worse:
>
> Exception in thread "main" java.lang.NullPointerException
>     at 
> org.apache.xmpbox.xml.PdfaExtensionHelper.populatePDFAPropertyType(PdfaExtensionHelper.java:180)

This has been fixed, output is now several errors including

7.6 : Error on MetaData, Unknown property value type : Seq Text
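
As a rough illustration of the general pattern (reporting an unknown value type instead of dereferencing a missing lookup result), here is a minimal sketch. It is not the actual commit: PropertyTypeLookup, UnknownTypeException and the small type table are hypothetical names invented for this example, not xmpbox API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from xmpbox.
final class PropertyTypeLookup {
    // Checked exception standing in for a parser or validation error.
    static final class UnknownTypeException extends Exception {
        UnknownTypeException(String message) {
            super(message);
        }
    }

    // Tiny stand-in table of value types the parser is assumed to know.
    private static final Map<String, String> KNOWN_TYPES = new HashMap<>();

    static {
        KNOWN_TYPES.put("Text", "TextType");
        KNOWN_TYPES.put("Integer", "IntegerType");
    }

    private PropertyTypeLookup() {
    }

    // Resolves a declared value type; an unknown type becomes a reportable
    // error instead of a null that is dereferenced later.
    static String resolve(String valueType) throws UnknownTypeException {
        String resolved = KNOWN_TYPES.get(valueType);
        if (resolved == null) {
            throw new UnknownTypeException("Unknown property value type : " + valueType);
        }
        return resolved;
    }
}
```

A caller that catches the exception can turn its message into a validation error entry, so the run ends with a report rather than a stack trace.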





Re: NPE through NFE in preflight due to "questionable" header entry

Posted by Tilman Hausherr <TH...@t-online.de>.
On 27.01.2016 at 15:08, Roberto Nibali wrote:
>
>
> The PDF uploaded to http://www.filedropper.com/6biasebm seems to have 
> a broken header section. It opens well with Preview, Adobe and others, 
> however preflight repeatedly trips pretty badly with a NFE:
>
> log4j: Adding appender named [console] to category [root].
> 2016-01-27 14:52:38 DEBUG COSParser:1879 - Can't parse the header version.
> java.lang.NumberFormatException: For input string: "1.\"
>         at 
> sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)

That is only shown at debug level. The NFE is rethrown as an 
IOException and this should result in a normal preflight error.
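
The catch-and-rethrow pattern described here can be sketched as follows. This is a simplified illustration rather than the COSParser source: the class HeaderVersionParser, the method parseVersion and the message text are assumptions made for the example.

```java
import java.io.IOException;

// Hypothetical illustration of the pattern, not the PDFBox source.
final class HeaderVersionParser {
    private HeaderVersionParser() {
    }

    // Parses a version token such as "1.4". A malformed token surfaces as an
    // IOException so callers only have to deal with one checked exception.
    static float parseVersion(String versionToken) throws IOException {
        try {
            return Float.parseFloat(versionToken);
        } catch (NumberFormatException e) {
            // Keep the NFE as the cause so the original detail is not lost.
            throw new IOException("Error getting header version: " + versionToken, e);
        }
    }
}
```

A caller such as a validator can then catch the IOException in one place and convert it into a regular error entry in its report.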

However I don't get this with your file. Instead I get this, which is 
even worse:

Exception in thread "main" java.lang.NullPointerException
     at org.apache.xmpbox.xml.PdfaExtensionHelper.populatePDFAPropertyType(PdfaExtensionHelper.java:180)
     at org.apache.xmpbox.xml.PdfaExtensionHelper.populatePDFASchemaType(PdfaExtensionHelper.java:159)
     at org.apache.xmpbox.xml.PdfaExtensionHelper.populateSchemaMapping(PdfaExtensionHelper.java:116)
     at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:194)
     at org.apache.pdfbox.preflight.process.MetadataValidationProcess.validate(MetadataValidationProcess.java:69)
     at org.apache.pdfbox.preflight.utils.ContextHelper.callValidation(ContextHelper.java:84)
     at org.apache.pdfbox.preflight.utils.ContextHelper.validateElement(ContextHelper.java:122)
     at org.apache.pdfbox.preflight.PreflightDocument.validate(PreflightDocument.java:163)
     at org.apache.pdfbox.preflight.Validator_A1b.runSimple(Validator_A1b.java:174)
     at org.apache.pdfbox.preflight.Validator_A1b.main(Validator_A1b.java:135)