Posted to dev@pdfbox.apache.org by Ad...@swmc.com on 2010/08/24 23:24:42 UTC

PDF Validator

Everyone,

I've been dealing with a lot of out of spec PDFs lately, as well as seeing 
things on the mailing list like instances of "-" and "." where there 
should be numbers.  Would anyone else find it helpful to have a 
"nonConforming" flag that would default to false and be set to true any 
time we find any of these errors (and perhaps add an error message to a 
List<String>)?  This would make it easy to determine if a PDF is invalid. 
This could be used not only in our own test cases (to make sure the PDFs 
we output follow the spec), but also by other projects like iText, 
ScanSoft, and other maintainers of PDF-producing software.  The better 
PDFs they produce, the easier it is for everyone to deal with them!
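
To make the idea concrete, here is a rough sketch of what the flag and message list could look like. ConformanceReport and reportError are made-up names for illustration, not existing PDFBox classes or methods, and where such an object would live (on the document, on the parser) is left open.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder for conformance results; not an existing PDFBox class.
class ConformanceReport {
    // Defaults to false and flips to true on the first reported problem.
    private boolean nonConforming = false;
    private final List<String> errors = new ArrayList<String>();

    // Would be called wherever the parser currently tolerates (and logs)
    // an out-of-spec construct, e.g. "-" or "." where a number belongs.
    void reportError(String message) {
        nonConforming = true;
        errors.add(message);
    }

    boolean isNonConforming() {
        return nonConforming;
    }

    // Read-only view so callers cannot alter the recorded errors.
    List<String> getErrors() {
        return Collections.unmodifiableList(errors);
    }
}
```

A test case could then simply assert that isNonConforming() is still false after loading a PDF we produced ourselves.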

It'd be nice to be the de facto standard for PDF validation, and we're 
already dealing with and usually logging these issues as it is, so it 
doesn't seem like it'd be that much work to add this.  All we would need 
is a command-line utility that loads the PDF, checks whether it's valid, 
and prints the specific errors if it's not.
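
The utility itself could be as small as the sketch below. ValidatePdf is a hypothetical name, and the validate method is only a stand-in: it checks nothing but the "%PDF-" header, whereas the real version would run the normal parser and return whatever errors it collected. The exit codes are an assumption as well.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical command-line entry point; names and checks are illustrative only.
class ValidatePdf {

    // Stand-in for the real parser: only checks that the data starts with "%PDF-".
    static List<String> validate(byte[] bytes) {
        List<String> errors = new ArrayList<String>();
        int n = Math.min(bytes.length, 5);
        String head = new String(bytes, 0, n, StandardCharsets.ISO_8859_1);
        if (!head.equals("%PDF-")) {
            errors.add("File does not start with the %PDF- header");
        }
        return errors;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ValidatePdf <file.pdf>");
            System.exit(2);
        }
        List<String> errors = validate(Files.readAllBytes(Paths.get(args[0])));
        if (errors.isEmpty()) {
            System.out.println("valid");
        } else {
            // Print each specific error, then signal failure to the caller.
            for (String error : errors) {
                System.out.println("ERROR: " + error);
            }
            System.exit(1);
        }
    }
}
```

A non-zero exit status on invalid input would let other projects call this from their build or test scripts.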

What do you think?  Would this be something you would find useful?

---- 
Thanks,
Adam

Re: PDF Validator

Posted by Johannes Koch <jo...@fit.fraunhofer.de>.
Am 24.8.2010 23:24, schrieb Adam@swmc.com:
> Everyone,
>
> I've been dealing with a lot of out of spec PDFs lately, as well as seeing
> things on the mailing list like instances of "-" and "." where there
> should be numbers.  Would anyone else find it helpful to have a
> "nonConforming" flag that would default to false and be set to true any
> time we find any of these errors (and perhaps add an error message to a
> List<String>)?

Or introduce some ErrorHandler like in SAX and SAC?
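
A rough sketch of what I mean, modeled on the warning/error/fatalError split of org.xml.sax.ErrorHandler. The interface and class names, and the byte-offset parameter, are assumptions for illustration; nothing like this exists in PDFBox today.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface modeled on org.xml.sax.ErrorHandler.
interface PdfErrorHandler {
    void warning(String message, long offset);
    void error(String message, long offset);
    void fatalError(String message, long offset);
}

// One possible handler: collect everything and let the caller decide what to do.
class CollectingErrorHandler implements PdfErrorHandler {
    final List<String> messages = new ArrayList<String>();

    public void warning(String message, long offset) {
        add("warning", message, offset);
    }

    public void error(String message, long offset) {
        add("error", message, offset);
    }

    public void fatalError(String message, long offset) {
        add("fatal", message, offset);
    }

    // Stores each report as "<level> at offset <n>: <message>".
    private void add(String level, String message, long offset) {
        messages.add(level + " at offset " + offset + ": " + message);
    }
}
```

The parser would call the handler instead of setting a flag, so an application could collect, log, or abort on problems as it sees fit.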

-- 
Johannes Koch
Fraunhofer Institute for Applied Information Technology FIT
Web Compliance Center
Schloss Birlinghoven, D-53757 Sankt Augustin, Germany
Phone: +49-2241-142628    Fax: +49-2241-142065